pg11分区索引使用总结
在 PostgreSQL 10 中,分区上的索引需要基于各个分区手动创建,而不能基于分区的父表创建索引。PostgreSQL 11 可以基于分区表创建索引。分区表上的索引并不会创建一个物理上的索引,而是为每个分区上的索引创建一个模板.
分区自动索引
如果在分区表上创建了一个索引,PostgreSQL自动为每个分区创建具有相同属性的索引,其主表本身并没有自身索引。
自动创建的索引,名称按照 “{partition name}_{column name}idx” 的模式定义。多个字段的复合索引使用下划线()连接字段名称。如果索引名称已经存在,在名称的最后添加一个数字。如果名称过长,使用缩写。
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
--建子分区measurement_y2018
CREATE TABLE measurement_y2018 PARTITION OF measurement FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
--建子分区measurement_y2019
CREATE TABLE measurement_y2019 PARTITION OF measurement FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
--在主表非分区列建索引
CREATE INDEX idx_measurement_peaktemp ON measurement(peaktemp);
CREATE INDEX idx_measurement_peaktemp_city_id ON measurement(city_id);
--在主表分区表分区列建索引
CREATE INDEX idx_measurement_peaktemp_1 ON measurement(logdate,peaktemp);
--在子分区表分区列建索引
CREATE INDEX idx_measurement_peaktemp_y2018 ON measurement_y2018(logdate);
--在子分区表非分区列建索引
CREATE INDEX idx_measurement_peaktemp_y2018_2 ON measurement_y2018(peaktemp);
--在子分区表非分区列建复合索引
CREATE INDEX idx_measurement_peaktemp_y2018_1 ON measurement_y2018(logdate,peaktemp);
查询数据库索引如下:
postgres=# select * from pg_indexes where tablename like '%measurement%' order by tablename;
schemaname | tablename | indexname | tablespace | indexdef
------------+-------------------+----------------------------------------+------------+----------------------------------------------------------------
-------------------------------------------------
public | measurement_y2018 | idx_measurement_peaktemp_y2018 | | CREATE INDEX idx_measurement_peaktemp_y2018 ON public.measureme
nt_y2018 USING btree (logdate)
public | measurement_y2018 | idx_measurement_peaktemp_y2018_2 | | CREATE INDEX idx_measurement_peaktemp_y2018_2 ON public.measure
ment_y2018 USING btree (peaktemp)
public | measurement_y2018 | idx_measurement_peaktemp_y2018_1 | | CREATE INDEX idx_measurement_peaktemp_y2018_1 ON public.measure
ment_y2018 USING btree (logdate, peaktemp)
public | measurement_y2018 | measurement_y2018_logdate_peaktemp_idx | | CREATE INDEX measurement_y2018_logdate_peaktemp_idx ON public.m
easurement_y2018 USING btree (logdate, peaktemp)
public | measurement_y2018 | measurement_y2018_peaktemp_idx | | CREATE INDEX measurement_y2018_peaktemp_idx ON public.measureme
nt_y2018 USING btree (peaktemp)
public | measurement_y2019 | measurement_y2019_logdate_peaktemp_idx | | CREATE INDEX measurement_y2019_logdate_peaktemp_idx ON public.m
easurement_y2019 USING btree (logdate, peaktemp)
public | measurement_y2019 | measurement_y2019_peaktemp_idx | | CREATE INDEX measurement_y2019_peaktemp_idx ON public.measureme
nt_y2019 USING btree (peaktemp)
(7 rows)
以_idx结尾的为由主表为分区表自动创建的索引
随后新增的分区或者通过 ATTACH PARTITION 挂载的分区都会自动创建相应的索引。
CREATE TABLE measurement_y2020 (LIKE measurement);
ALTER TABLE measurement ATTACH PARTITION measurement_y2020
FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
postgres=# \d measurement_y2020
Table "public.measurement_y2020"
Column | Type | Collation | Nullable | Default
-----------+---------+-----------+----------+---------
city_id | integer | | not null |
logdate | date | | not null |
peaktemp | integer | | |
unitsales | integer | | |
Partition of: measurement FOR VALUES FROM ('2020-01-01') TO ('2021-01-01')
Indexes:
"measurement_y2020_logdate_peaktemp_idx" btree (logdate, peaktemp)
"measurement_y2020_peaktemp_idx" btree (peaktemp)
自动创建的索引不能单独删除,可以通过分区表统一删除。
postgres=# DROP INDEX measurement_y2020_peaktemp_idx;
ERROR: cannot drop index measurement_y2020_peaktemp_idx because index idx_measurement_peaktemp requires it
HINT: You can drop index idx_measurement_peaktemp instead.
分区表唯一约束
对于 PostgreSQL 10,只能基于分区创建唯一约束(PRIMARY KEY 和 UNIQUE KEY),而不能针对分区的父表创建唯一约束。PostgreSQL 11 支持分区表上的唯一约束。
CREATE TABLE rtable(c1 INT, c2 VARCHAR(10)) PARTITION BY RANGE(c1);
ALTER TABLE rtable ADD CONSTRAINT pk_rtable PRIMARY KEY(c1);
\d rtable
Table "public.rtable"
Column | Type | Collation | Nullable | Default
--------+-----------------------+-----------+----------+---------
c1 | integer | | not null |
c2 | character varying(10) | | |
Partition key: RANGE (c1)
Indexes:
"pk_rtable" PRIMARY KEY, btree (c1)
Number of partitions: 0
添加分区或者加载(ATTACH)分区时自动创建相应的主键:
CREATE TABLE rtable100 PARTITION OF rtable FOR VALUES FROM (1) TO (100);
\d rtable100
Table "public.rtable100"
Column | Type | Collation | Nullable | Default
--------+-----------------------+-----------+----------+---------
c1 | integer | | not null |
c2 | character varying(10) | | |
Partition of: rtable FOR VALUES FROM (1) TO (100)
Indexes:
"rtable100_pkey" PRIMARY KEY, btree (c1)
如果在分区表上创建了唯一约束,无法再创建基于外部表(FOREIGN TABLE)的分区。因为无法为外部表创建唯一约束。
CREATE FOREIGN TABLE rtable200 PARTITION OF rtable FOR VALUES FROM (101) TO (200) SERVER remote1;
ERROR: cannot create index on foreign table "rtable200"
主键约束或唯一约束必须包含分区字段。这样才能确保整个分区表内的唯一性,因为每个分区上的唯一约束只维护自身的唯一性。
CREATE TABLE rtable1(c1 INT, c2 VARCHAR(10)) PARTITION BY RANGE(c1);
ALTER TABLE rtable1 ADD CONSTRAINT pk_table1 PRIMARY KEY(c2);
ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "rtable1" lacks column "c1" which is part of the partition key.
分区表+索引查询分析
查询条件无分区键,全表扫描,各个子分区都会查询一遍
postgres=# explain select * from measurement where city_id =1;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Append (cost=4.22..44.43 rows=27 width=16)
-> Bitmap Heap Scan on measurement_y2018 (cost=4.22..14.76 rows=9 width=16)
Recheck Cond: (city_id = 1)
-> Bitmap Index Scan on measurement_y2018_city_id_idx (cost=0.00..4.22 rows=9 width=0)
Index Cond: (city_id = 1)
-> Bitmap Heap Scan on measurement_y2019 (cost=4.22..14.76 rows=9 width=16)
Recheck Cond: (city_id = 1)
-> Bitmap Index Scan on measurement_y2019_city_id_idx (cost=0.00..4.22 rows=9 width=0)
Index Cond: (city_id = 1)
-> Bitmap Heap Scan on measurement_y2020 (cost=4.22..14.76 rows=9 width=16)
Recheck Cond: (city_id = 1)
-> Bitmap Index Scan on measurement_y2020_city_id_idx (cost=0.00..4.22 rows=9 width=0)
Index Cond: (city_id = 1)
(13 rows)
查询条件包括分区键(logdate),先定位具体的分区子表,再在子表进行索引查询
postgres=# explain select * from measurement where city_id =1 and logdate ='2019-03-01';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Append (cost=8.69..12.71 rows=1 width=16)
-> Bitmap Heap Scan on measurement_y2019 (cost=8.69..12.71 rows=1 width=16)
Recheck Cond: ((city_id = 1) AND (logdate = '2019-03-01'::date))
-> BitmapAnd (cost=8.69..8.69 rows=1 width=0)
-> Bitmap Index Scan on measurement_y2019_city_id_idx (cost=0.00..4.22 rows=9 width=0)
Index Cond: (city_id = 1)
-> Bitmap Index Scan on measurement_y2019_logdate_peaktemp_idx (cost=0.00..4.22 rows=9 width=0)
Index Cond: (logdate = '2019-03-01'::date)