MaxCompute:阿里大数据计算服务浅析（SQL）

“昔之得一者：天一以清，地得一以宁，神得一以灵，谷得一以盈，侯王得一以为天下正，其至也，谓：天毋已清将恐裂，地毋已宁将恐发；神毋已灵将恐歇，故毋已盈将恐竭，侯王毋已贵以高将恐蹶。
故必贵而以贱为本，必高矣而以下为基。
夫是以侯王自谓孤、寡、不榖。
此其贱之本与，非也？
故致数与无与。
是故不欲禄禄如玉，珞珞如石。”¹

MaxCompute

MaxCompute，前身ODPS(Open Data Processing Service)，是阿里巴巴通用计算平台提供的一种快速、完全托管的 GB/TB/PB 级数据仓库解决方案，MaxCompute 向用户提供了完善的数据导入方案以及多种经典的分布式计算模型，能够更快速的解决用户海量数据计算问题。

本篇参考maxcompute官方文档而来，有兴趣可以前往。

SQL

maxcompute sql可以看作是标准sql的子集，但是也有差异。

运算符

操作符说明

like 通配符
如果A或B为NULL，返回NULL，A为字符串，B为要匹配的模式，如果匹配，返回TRUE，否则返回FALSE。’%’匹配任意多个字符，’_‘匹配单个字符。要匹配’%’或’_’需要用转义符表示’%’，’_’。

rlike 正则；A是字符串，B是字符串常量正则表达式；如果匹配成功，返回TRUE，否则返回FALSE；如果B为空串会报错退出；如果A或B为NULL，返回NULL；

A & B 返回A与B进行按位与的结果。例如：1&2返回0，1&3返回1，NULL与任何值按位与都为NULL。 A和B必须为Bigint类型。

A | B 返回A与B进行按位或的结果。例如：1 |2返回3，1 |3返回3，NULL与任何值按位或都为NULL。 A和B 必须为Bigint类型。

1	select * from user where user_name like '%goshine_user%';
2	select * from user where user_name rlike '^goshine_user*';

关于正则表达式的笔记请参考我的另一篇笔记《从java(python)到scala的n种记忆》

由于 double 值存在一定的精度差，因此，不建议您直接使用等号对两个 double 类型的数据进行比较。您可以使用两个 double 类型相减，然后取绝对值的方式进行判断。当绝对值足够小时，认为两个 double 数值相等。

1	abs(0.9999999999 - 1.0000000000) < 0.000000001
2	-- 0.9999999999和1.0000000000为10位精度，而0.000000001为9位精度。
3	-- 此时可以认为0.9999999999和1.0000000000相等。

类型转换

分为隐式类型转换和显示类型转换，显示使用cast：

1	select case(user_id as double) as new_id from user;
2	select cast('2015-10-01 00:00:00' as datetime) as new_date from user;

date和string类型之间的转换
严格按照格式形如：yyyy-mm-dd hh:mi:ss，除正常的转换之外，妖孽一点的可以用TO_DATE:

1	to_date('阿里巴巴2010-1203', '阿里巴巴yyyy-mmdd') = 2010-12-03 00:00:00
2	to_date('20080718', 'yyyymmdd') = 2008-07-18 00:00:00
3	to_date('200807182030','yyyymmddhhmi')=2008-07-18 20:30:00
4	to_date('2008718', 'yyyymmdd')
5	-- 格式不符合，引发异常
6	to_date('阿里巴巴2010-123', '阿里巴巴yyyy-mmdd')
7	-- 格式不符合，引发异常
8	to_date('2010-24-01', 'yyyy')
9	-- 格式不符合，引发异常

同时maxcompute提供了一些列其他的内建函数（DATEADD、DATEDIFF、DATEPART、 DATETRUNC、FROM_UNIXTIME、GETDATE、ISDATE、LASTDAY等），一定会经常用到，可以去看看。

DDL语句

创建表
要加if not exists!

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[STORED BY StorageHandler] -- 仅限外部表
[WITH SERDEPROPERTIES (Options)] -- 仅限外部表
[LOCATION OSSLocation];-- 仅限外部表
[LIFECYCLE days]
[AS select_statement]
CREATE TABLE [IF NOT EXISTS] table_name
LIKE existing_table_name

一张表最多允许60000个分区，单表的分区层次不能超过6级。
lifecycle表的生命周期，单位：天。create table like语句不会复制源表的生命周期属性。
创建一张表：

create table if not exists sale_detail(
 shop_name     string,
 customer_id   string,
 total_price   double)
 partitioned by (sale_date string,region string);
 -- 创建一张分区表 sale_detail

复制表，方式一：建表的同时将数据复制到新表

1	create table sale_detail_ctas1 as
2	select * from sale_detail;

创建的表不会复制分区属性，只会把源表的分区列作为目标表的一般列处理，即sale_detail_ctas1是一个含有5列的非分区表。
复制表，方式二：指定列的名字

create table sale_detail_ctas2 as
        select shop_name,
            customer_id,
            total_price,
            '2013' as sale_date,
            'China' as region
        from sale_detail;

复制表，方式三：不指定列的名字

create table sale_detail_ctas3 as
        select shop_name,
            customer_id,
            total_price,
            '2013',
            'China'
        from sale_detail;

创建的表sale_detail_ctas3的第四、五列会是类似_c5，_c6
复制表，方式三：表和目标表具有相同的表结构

1	create table sale_detail_like like sale_detail;

注意：貌似，建表的时候只是指定了列的类型，并没有指定列的长度。

查看表信息

1	desc <table_name>;
2	desc extended <table_name>;--查看外部表信息

删除表
要加if exists!
1
DROP TABLE [IF EXISTS] table_name;
重命名表
1
ALTER TABLE table_name RENAME TO new_table_name;
^o^,逻辑有if exists的判断，但是形式上却没有嵌入的语法。

修改表的注释

1	ALTER TABLE table_name SET COMMENT 'tbl comment';

修改表的修改时间
MaxCompute SQL提供touch操作用来修改表的LastDataModifiedTime。效果会将表的LastDataModifiedTime修改为当前时间。此操作会改变表的LastDataModifiedTime的值，此时，MaxCompute会认为表的数据有变动，生命周期的计算会重新开始。
1
ALTER TABLE table_name TOUCH;
清空非分区表里的数据
将指定的非分区表中的数据清空，该命令不支持分区表。对于分区表，可以用ALTER TABLE table_name DROP PARTITION的方式将分区里的数据清除。
1
TRUNCATE TABLE table_name;

修改表的生命周期
MaxCompute提供数据生命周期管理功能，以方便您释放存储空间，简化回收数据的流程。

1	ALTER TABLE table_name SET lifecycle days;

如：

1	create table test_lifecycle(key string) lifecycle 100;
2	-- 新建test_lifecycle表，生命周期为100天。
3	alter table test_lifecycle set lifecycle 50;
4	-- 修改test_lifecycle表，将生命周期设为50天。

禁止生命周期
某些情况下，部分特定的分区不希望被生命周期功能自动回收掉，比如一个月的月初或双十一期间的数据，此时您可以禁止该分区被生命周期功能回收。
1
ALTER TABLE table_name [partition_spec] ENABLE|DISABLE LIFECYCLE;
如：
1
ALTER TABLE trans PARTITION(dt='20141111') DISABLE LIFECYCLE;
创建视图
要加if not exists!

视图没有分区（partition）的概念

1	CREATE [OR REPLACE] VIEW [IF NOT EXISTS] view_name
2	[(col_name [COMMENT col_comment], ...)]
3	[COMMENT view_comment]
4	[AS select_statement]

1.视图只能包含一个有效的select语句。
2.不允许向视图写入数据，例如使用insert into或者insert overwrite操作视图。
3.当建好视图后，如果视图的引用表发生了变更，有可能导致视图无法访问，例如删除被引用表。您需要自己维护引用表及视图之间的对应关系。

如：

1	create view if not exists sale_detail_view
2	(store_name, customer_id, price, sale_date, region)
3	comment 'a view for table sale_detail'
4	as select * from sale_detail;

删除视图
要加if exists!
1
DROP VIEW [IF EXISTS] view_name;

重命名视图

1	ALTER VIEW view_name RENAME TO new_view_name;

如：

create view if not exists sale_detail_view
        (store_name, customer_id, price, sale_date, region)
        comment 'a view for table sale_detail'
        as select * from sale_detail;
alter view sale_detail_view rename to market;

添加分区
1
ALTER TABLE TABLE_NAME ADD [IF NOT EXISTS] PARTITION partition_spec
2
partition_spec:
3
: (partition_col1 = partition_col_value1, partition_col2 = partiton_col_value2, ...)
1.仅支持新增分区，不支持新增分区字段。
2.目前MaxCompute单表支持的分区数量上限为6万。
3.对于多级分区的表，如果想添加新的分区，必须指明全部的分区值。

如：

alter table sale_detail add if not exists partition (sale_date='201312', region='hangzhou');
        -- 成功添加分区，用来存储2013年12月杭州地区的销售记录。
alter table sale_detail add if not exists partition (sale_date='201312', region='shanghai');
        -- 成功添加分区，用来存储2013年12月上海地区的销售记录。
alter table sale_detail add if not exists partition(sale_date='20111011');
        -- 仅指定一个分区sale_date，出错返回
alter table sale_detail add if not exists partition(region='shanghai');
        -- 仅指定一个分区region，出错返回

删除分区

1	ALTER TABLE TABLE_NAME DROP [IF EXISTS] PARTITION partition_spec;
2	partition_spec:
3	: (partition_col1 = partition_col_value1, partition_col2 = partiton_col_value2, ...)

如：

1	alter table sale_detail drop if exists partition(sale_date='201312',region='hangzhou');
2	-- 成功删除2013年12月杭州分区的销售。

添加列

1	ALTER TABLE table_name ADD COLUMNS (col_name1 type1, col_name2 type2...)

添加的新列不支持指定顺序，默认在最后一列。

修改列名

1	ALTER TABLE table_name CHANGE COLUMN old_col_name RENAME TO new_col_name;

修改列、分区注释

1	ALTER TABLE table_name CHANGE COLUMN col_name COMMENT comment_string;

同时修改列名及列注释

1	ALTER TABLE table_name CHANGE COLUMN old_col_name new_col_name column_type COMMENT column_comment;

修改表、分区的修改时间
MaxCompute SQL提供touch操作用来修改分区的LastDataModifiedTime。效果会将分区的LastDataModifiedTime修改为当前时间。
1
ALTER TABLE table_name TOUCH PARTITION(partition_col='partition_col_value', ...);

此操作会改变表的LastDataModifiedTime的值，此时，MaxCompute会认为表或分区的数据有变动，生命周期的计算会重新开始。

修改分区值
MaxCompute SQL支持通过rename操作更改对应表的分区值。

ALTER TABLE table_name PARTITION (partition_col1 = partition_col_value1, partition_col2 = partiton_col_value2, ...)

RENAME TO PARTITION (partition_col1 = partition_col_newvalue1, partition_col2 = partiton_col_newvalue2, ...);

Insert操作

更新表中的数据 insert overwrite/into

1	INSERT OVERWRITE\|INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)] [(col1,col2 ...)]
2	select_statement
3	FROM from_statement;

场景：在MaxCompute SQL处理数据的过程中，Insert overwrite/into用于将计算的结果保存目标表中，以供下一步计算使用。
overwrite/into区别：Insert into会向表或表的分区中追加数据，而Insert overwrite则会在向表或分区中插入数据前清空表中的原有数据。
如：

1	create table sale_detail_insert like sale_detail;
2	alter table sale_detail_insert add partition(sale_date='2013', region='china');
3	insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
4	select shop_name, customer_id, total_price from sale_detail;

注意：在进行Insert更新数据操作时，源表与目标表的对应关系依赖于在select子句中列的顺序，而不是表与表之间列名的对应关系：

insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
        select customer_id, shop_name, total_price from sale_detail;
    -- 在创建sale_detail_insert表时，列的顺序为：
    -- shop_name string, customer_id string, total_price bigint
    -- 而从sale_detail向sale_detail_insert插入数据是，sale_detail的插入顺序为：
    -- customer_id, shop_name, total_price
    -- 此时，会将sale_detail.customer_id的数据插入sale_detail_insert.shop_name
    -- 将sale_detail.shop_name的数据插入sale_detail_insert.customer_id

非法语句，场景一：向某个分区插入数据时，分区列不允许出现在select列表中：

insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')

        select shop_name, customer_id, total_price, sale_date, region  from sale_detail;

    -- 报错返回，sale_date，region 为分区列，不允许出现在静态分区的 insert 语句中。

非法语句，场景二：partition的值只能是常量，不可以出现表达式。

1	insert overwrite table sale_detail_insert partition (sale_date=datepart('2016-09-18 01:10:00', 'yyyy') , region='china')
2	select shop_name, customer_id, total_price from sale_detail;

多路输出 Multi insert
MaxCompute SQL支持在一个语句中插入不同的结果表或者分区。

FROM from_statement
        INSERT OVERWRITE | INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
            select_statement1 [FROM from_statement]
        [INSERT OVERWRITE | INTO TABLE tablename2 [PARTITION (partcol1=val3, partcol2=val4 ...)]
            select_statement2 [FROM from_statement]]

注意：对于同一张分区表的不同分区，不能同时有Insert overwrite和Insert into操作，否则报错返回。

create table sale_detail_multi like sale_detail;

from sale_detail
        insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
            select shop_name, customer_id, total_price where .....
        insert overwrite table sale_detail_multi partition (sale_date='2011', region='china' )
            select shop_name, customer_id, total_price where .....;
    -- 成功返回，将 sale_detail 的数据插入到 sales 里的 2010 年及 2011 年中国大区的销售记录中。

from sale_detail
        insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
            select shop_name, customer_id, total_price
        insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
            select shop_name, customer_id, total_price;
    -- 出错返回，同一分区出现多次。

from sale_detail
        insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
            select shop_name, customer_id, total_price
        insert into table sale_detail_multi partition (sale_date='2011', region='china' )
            select shop_name, customer_id, total_price;
    -- 出错返回，同一张表的不同分区，不能同时有 insert overwrite 和 insert into 操作。

输出到动态分区 Dynamic partition
在Insert overwrite到一张分区表时，可以在语句中指定分区的值。也可以用另外一种更加灵活的方式，在分区中指定一个分区列名，但不给出值。相应地，在select子句中的对应列来提供分区的值。

1	insert overwrite table tablename partition (partcol1, partcol2 ...) select_statement from from_statement;

如：

1	create table total_revenues (revenue bigint) partitioned by (region string);
2	insert overwrite table total_revenues partition(region)
3	select total_price as revenue, region
4	from sale_detail;

在SQL运行之前，是不知道会产生哪些分区的，只有在select运行结束后，才能由region字段产生的值确定会产生哪些分区，这也是叫做动态分区的原因。

“一个顺序”需要谨记：

create table sale_detail_dypart like sale_detail;--创建示例目标表

insert overwrite table sale_detail_dypart partition (sale_date, region)
        select shop_name,customer_id,total_price,sale_date,region from sale_detail;
    -- 成功返回;

此时sale_detail表中，sale_date的值决定目标表的sale_date分区值，region的值决定目标表的region分区值。约定俗成select的最后的字段为partition的字段，双发的值只与其顺序有关！

Values
通常在业务测试阶段，需要给一个小数据表准备些基本数据，您可以通过INSERT … VALUES的方法快速对测试表写入一些测试数据。

1	INSERT INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)][co1name1,colname2...] VALUES (col1_value,col2_value,...)[,(col1_value,col2_value,...),...]

场景一，插入整行数据：

1	drop table if exists srcp;
2	create table if not exists srcp (key string ,value bigint) partitioned by (p string);
3	insert into table srcp partition (p='abc') values ('a',1),('b',2),('c',3);
4
5	--
6	+-----+------------+---+
7	\| key \| value \| p \|
8	+-----+------------+---+
9	\| a \| 1 \| abc \|
10	\| b \| 2 \| abc \|
11	\| c \| 3 \| abc \|
12	+-----+------------+---+

场景二，插入部分列：

1	drop table if exists srcp;
2	create table if not exists srcp (key string ,value bigint) partitioned by (p string);
3	insert into table srcp partition (p)(key,p) values ('d','20170101'),('e','20170101'),('f','20170101');
4
5	--
6	+-----+------------+---+
7	\| key \| value \| p \|
8	+-----+------------+---+
9	\| d \| NULL \| 20170101 \|
10	\| e \| NULL \| 20170101 \|
11	\| f \| NULL \| 20170101 \|
12	+-----+------------+---+

对于在values中没有制定的列，可以看到取缺省值为NULL。插入列表功能不一定和values一起用，对于Insert into…select…，同样可以使用。

Insert…values有一个限制：values必须是常量，但是有时候希望在插入的数据中进行一些简单的运算，此时可以使用MaxCompute的values table功能，详情见场景三。

场景三，动态插入：

1	drop table if exists srcp;
2	create table if not exists srcp (key string ,value bigint) partitioned by (p string);
3	insert into table srcp partition (p) select concat(a,b), length(a)+length(b),'20170102' from values ('d',4),('e',5),('f',6) t(a,b);
4
5	--
6	+-----+------------+---+
7	\| key \| value \| p \|
8	+-----+------------+---+
9	\| d4 \| 2 \| 20170102 \|
10	\| e5 \| 2 \| 20170102 \|
11	\| f6 \| 2 \| 20170102 \|
12	+-----+------------+---+

其中的values (…), (…) t (a, b)，相当于定义了一个名为t，列为a，b的表，类型为（a string，b bigint），其中的类型从values列表中推导。这样在不准备任何物理表的时候，可以模拟一个有任意数据的，多行的表，并进行任意运算。

Select操作

SELECT [ALL | DISTINCT] select_expr, select_expr, ...
        FROM table_reference
        [WHERE where_condition]
        [GROUP BY col_list]
        [ORDER BY order_condition]
        [DISTRIBUTE BY distribute_condition [SORT BY sort_condition] ]
        [LIMIT number]

设置分区条件来指定扫描的分区!，当需要对分区进行全表扫描的时候，加上set odps.sql.allow.fullscan=true;执行的时候，set语句和sql语句一起提交执行。如：

1	set odps.sql.allow.fullscan=true;
2	select * from sale_detail;

如果需要整个项目都允许全表扫描，可以通过开关自行打开或关闭(true/false)，命里如下：

1	setproject odps.sql.allow.fullscan=true;

distribute by:对数据按照某几列的值做 hash 分片，必须使用 Select 的输出列别名(当没有别名时用列名，当有别名时必须用别名，否则报错！)。
sort by:局部排序，语句前必须加 distribute by。实际上 sort by 是对 distribute by 的结果进行局部排序。必须使用 Select 的输出列别名。
order by:对所有数据按照某几列进行全局排序，order by 必须与 limit 共同使用。

select 语序

1	SELECT key, max(value) FROM src t WHERE value > 0 GROUP BY key HAVING sum(value) > 100 ORDER BY key LIMIT 100;

实际上的逻辑执行顺序是FROM->WHERE->GROUY BY->HAVING->SELECT->ORDER BY->LIMIT。
MaxCompute 支持以执行顺序书写查询语句，例如上面的语句可以写为：

1	FROM src t WHERE value > 0 GROUP BY key HAVING sum(value) > 100 SELECT key, max(value) ORDER BY key LIMIT 100;

子查询

子查询必须有别名。
在 from 子句中，子查询可以当作一张表来使用，与其它的表或子查询进行 Join 操作，如下所示：

1	create table shop as select * from sale_detail;
2	select a.shop_name, a.customer_id, a.total_price from
3	(select * from shop) a join sale_detail on a.shop_name = sale_detail.shop_name;

1:老子《道德经》第三十九章，老子故里，中国鹿邑。

1	CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
2	[(col_name data_type [COMMENT col_comment], ...)]
3	[COMMENT table_comment]
4	[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
5	[STORED BY StorageHandler] -- 仅限外部表
6	[WITH SERDEPROPERTIES (Options)] -- 仅限外部表
7	[LOCATION OSSLocation];-- 仅限外部表
8	[LIFECYCLE days]
9	[AS select_statement]
10	CREATE TABLE [IF NOT EXISTS] table_name
11	LIKE existing_table_name

1	create table if not exists sale_detail(
2	shop_name string,
3	customer_id string,
4	total_price double)
5	partitioned by (sale_date string,region string);
6	-- 创建一张分区表 sale_detail

1	create table sale_detail_ctas2 as
2	select shop_name,
3	customer_id,
4	total_price,
5	'2013' as sale_date,
6	'China' as region
7	from sale_detail;

1	create table sale_detail_ctas3 as
2	select shop_name,
3	customer_id,
4	total_price,
5	'2013',
6	'China'
7	from sale_detail;

1	ALTER TABLE TABLE_NAME ADD [IF NOT EXISTS] PARTITION partition_spec
2	partition_spec:
3	: (partition_col1 = partition_col_value1, partition_col2 = partiton_col_value2, ...)

1	alter table sale_detail add if not exists partition (sale_date='201312', region='hangzhou');
2	-- 成功添加分区，用来存储2013年12月杭州地区的销售记录。
3	alter table sale_detail add if not exists partition (sale_date='201312', region='shanghai');
4	-- 成功添加分区，用来存储2013年12月上海地区的销售记录。
5	alter table sale_detail add if not exists partition(sale_date='20111011');
6	-- 仅指定一个分区sale_date，出错返回
7	alter table sale_detail add if not exists partition(region='shanghai');
8	-- 仅指定一个分区region，出错返回

1	insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
2	select customer_id, shop_name, total_price from sale_detail;
3	-- 在创建sale_detail_insert表时，列的顺序为：
4	-- shop_name string, customer_id string, total_price bigint
5	-- 而从sale_detail向sale_detail_insert插入数据是，sale_detail的插入顺序为：
6	-- customer_id, shop_name, total_price
7	-- 此时，会将sale_detail.customer_id的数据插入sale_detail_insert.shop_name
8	-- 将sale_detail.shop_name的数据插入sale_detail_insert.customer_id

1	FROM from_statement
2	INSERT OVERWRITE \| INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
3	select_statement1 [FROM from_statement]
4	[INSERT OVERWRITE \| INTO TABLE tablename2 [PARTITION (partcol1=val3, partcol2=val4 ...)]
5	select_statement2 [FROM from_statement]]

1	create table sale_detail_multi like sale_detail;
2
3	from sale_detail
4	insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
5	select shop_name, customer_id, total_price where .....
6	insert overwrite table sale_detail_multi partition (sale_date='2011', region='china' )
7	select shop_name, customer_id, total_price where .....;
8	-- 成功返回，将 sale_detail 的数据插入到 sales 里的 2010 年及 2011 年中国大区的销售记录中。
9
10	from sale_detail
11	insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
12	select shop_name, customer_id, total_price
13	insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
14	select shop_name, customer_id, total_price;
15	-- 出错返回，同一分区出现多次。
16
17	from sale_detail
18	insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
19	select shop_name, customer_id, total_price
20	insert into table sale_detail_multi partition (sale_date='2011', region='china' )
21	select shop_name, customer_id, total_price;
22	-- 出错返回，同一张表的不同分区，不能同时有 insert overwrite 和 insert into 操作。

1	create table sale_detail_dypart like sale_detail;--创建示例目标表
2
3	insert overwrite table sale_detail_dypart partition (sale_date, region)
4	select shop_name,customer_id,total_price,sale_date,region from sale_detail;
5	-- 成功返回;

1	SELECT [ALL \| DISTINCT] select_expr, select_expr, ...
2	FROM table_reference
3	[WHERE where_condition]
4	[GROUP BY col_list]
5	[ORDER BY order_condition]
6	[DISTRIBUTE BY distribute_condition [SORT BY sort_condition] ]
7	[LIMIT number]