💃🏻 📥 🎼 PostgreSQLでのパーティション分割-何？なんで？どうやって？ 👏🏻 😑 🐢

残念ながら、PostgreSQLのテーブルをパーティション分割する機能は、これまで多くの人が積極的に使用していません。 私の意見では、Hubert Lubaczewski（ depesz.com ）は彼女の仕事について非常に価値があると語っています。 彼の記事の別の翻訳を提供します！

最近、私はパーティション分割を使用できるケースに出くわすことが多いことに気付きました。そして、理論的には、ほとんどの人はその存在を知っていますが、実際、この機能はよく理解されておらず、それを恐れている人もいます。

それで、私の知識と能力を最大限に説明し、それが何であるのか、なぜそれを使うべきなのか、どうやってやるのかを説明しようと思います。

ご存知のとおり、PostgreSQLにはテーブルがあり、テーブルにはデータがあります。場合によってはほんの数行であり、時には数十億行です。

パーティショニングは、（列ではなくレコードの数に基づいて）大きなテーブルを多くの小さなテーブルに分割する方法です。そして、これはアプリケーションに対して透過的な方法で行われることが望ましいです。

PostgreSQLでめったに使用されない機能の1つは、PostgreSQLがオブジェクトリレーショナルデータベースであることです。また、オブジェクト（またはクラス）が「継承」と呼ばれるものを知っているため、「オブジェクト」がキーワードです。これがパーティション化に使用されるものです。

私たちが話していることを見てみましょう。

通常のユーザーテーブルを作成します。

$ create table users ( id serial primary key, username text not null unique, password text, created_on timestamptz not null, last_logged_on timestamptz not null );

次に、図を完成させるために、数行と追加のインデックスを追加しましょう。

 $ insert into users (username, password, created_on, last_logged_on) select random_string( (random() * 4 + 5)::int4), random_string( 20 ), now() - '2 years'::interval * random(), now() - '2 years'::interval * random() from generate_series(1, 10000); $ create index newest_users on users (created_on);

したがって、テストテーブルがあります。

 $ \d Table "public.users" Column | Type | Modifiers ----------------+--------------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) username | text | not null password | text | created_on | timestamp with time zone | not null last_logged_on | timestamp with time zone | not null Indexes: "users_pkey" PRIMARY KEY, btree (id) "users_username_key" UNIQUE CONSTRAINT, btree (username) "newest_users" btree (created_on)

いくつかのランダムデータ：

 $ select * from users limit 10; id | username | password | created_on | last_logged_on ----+----------+----------------------+-------------------------------+------------------------------- 1 | ityfce3 | 2ukgbflj_l2ndo3vilt2 | 2015-01-02 16:56:41.346113+01 | 2015-04-15 12:34:58.318913+02 2 | _xg_pv | u8hy20aifyblg9f3_rf2 | 2014-09-27 05:41:05.317313+02 | 2014-08-07 14:46:14.197313+02 3 | uvi1wo | h09ae85v_f_cx0gf6_8r | 2013-06-17 18:48:44.389313+02 | 2014-06-03 06:53:49.640513+02 4 | o6rgs | vzbrkwhnsucxco5pjep0 | 2015-01-30 11:33:25.150913+01 | 2013-11-05 07:18:47.730113+01 5 | nk61jw77 | lidk_mnpe_olffmod7ed | 2014-06-15 07:18:34.597313+02 | 2014-03-21 17:42:44.763713+01 6 | 3w326_2u | pyoqg87feemojhql7jrn | 2015-01-20 05:41:54.133313+01 | 2014-09-07 20:33:23.682113+02 7 | m9rk9mnx | 6pvt94s6ol46kn0yl62b | 2013-07-17 15:13:36.315713+02 | 2013-11-12 10:53:06.123713+01 8 | adk6c | egfp8re0z492e6ri8urz | 2014-07-23 11:41:11.883713+02 | 2013-10-22 07:19:36.200513+02 9 | rsyaedw | ond0tie9er92oqhmdj39 | 2015-05-11 16:45:40.472513+02 | 2013-08-31 17:29:18.910913+02 10 | prlobe46 | _3br5v97t2xngcd7xz4n | 2015-01-10 20:13:29.461313+01 | 2014-05-04 06:25:56.072513+02 (10 rows)

テーブルの準備ができたので、パーティションを作成できます。つまり、継承されたテーブルを意味します。

 $ create table users_1 () inherits (users); $ \d users_1 Table "public.users_1" Column | Type | Modifiers ----------------+--------------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) username | text | not null password | text | created_on | timestamp with time zone | not null last_logged_on | timestamp with time zone | not null Inherits: users

したがって、特定の興味深いプロパティを持つ新しいテーブルがあります。

彼女はid列にメインテーブルと同じシーケンスを使用します。
非ヌル制約を含むすべての列の定義は同じです。
主キー、ユーザー名の一意性制限、created_onのインデックスはありません。

もう一度試してみましょうが、今回はより爆発的な効果があります。

 $ drop table users_1; $ create table users_1 ( like users including all ); $ \d users_1 Table "public.users_1" Column | Type | Modifiers ----------------+--------------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) username | text | not null password | text | created_on | timestamp with time zone | not null last_logged_on | timestamp with time zone | not null Indexes: "users_1_pkey" PRIMARY KEY, btree (id) "users_1_username_key" UNIQUE CONSTRAINT, btree (username) "users_1_created_on_idx" btree (created_on)

これですべてのインデックスと制限ができましたが、継承情報は失われました。ただし、後で追加できます。

 $ alter table users_1 inherit users; $ \d users_1 Table "public.users_1" Column | Type | Modifiers ----------------+--------------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) username | text | not null password | text | created_on | timestamp with time zone | not null last_logged_on | timestamp with time zone | not null Indexes: "users_1_pkey" PRIMARY KEY, btree (id) "users_1_username_key" UNIQUE CONSTRAINT, btree (username) "users_1_created_on_idx" btree (created_on) Inherits: users

1つのステップでそれを行うことができますが、その後、さまざまな不快な通知が表示されます。

 $ drop table users_1; $ create table users_1 ( like users including all ) inherits (users); NOTICE: merging column "id" with inherited definition NOTICE: merging column "username" with inherited definition NOTICE: merging column "password" with inherited definition NOTICE: merging column "created_on" with inherited definition NOTICE: merging column "last_logged_on" with inherited definition $ \d users_1 Table "public.users_1" Column | Type | Modifiers ----------------+--------------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) username | text | not null password | text | created_on | timestamp with time zone | not null last_logged_on | timestamp with time zone | not null Indexes: "users_1_pkey" PRIMARY KEY, btree (id) "users_1_username_key" UNIQUE CONSTRAINT, btree (username) "users_1_created_on_idx" btree (created_on) Inherits: users

いずれにせよ、メインと最初のパーティションの2つのテーブルがあります。

ユーザーに対して何らかのアクション（フェッチ/更新/削除）を実行すると、両方のテーブルがスキャンされます。

 $ explain analyze select * from users where id = 123; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Append (cost=0.29..16.47 rows=2 width=66) (actual time=0.008..0.009 rows=1 loops=1) -> Index Scan using users_pkey on users (cost=0.29..8.30 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1) Index Cond: (id = 123) -> Index Scan using users_1_pkey on users_1 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) Planning time: 0.327 ms Execution time: 0.031 ms (7 rows)

ただし、パーティションに直接アクセスすると、リクエストはそのパーティションでのみ実行されます。

 $ explain analyze select * from users_1 where id = 123; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Scan using users_1_pkey on users_1 (cost=0.15..8.17 rows=1 width=84) (actual time=0.002..0.002 rows=0 loops=1) Index Cond: (id = 123) Planning time: 0.162 ms Execution time: 0.022 ms (4 rows)

必要に応じて、ONLYキーワードを使用してパーティションなしでユーザーテーブルにのみアクセスできます。

 $ explain analyze select * from only users where id = 123; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Index Scan using users_pkey on users (cost=0.29..8.30 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1) Index Cond: (id = 123) Planning time: 0.229 ms Execution time: 0.031 ms (4 rows)

お気付きかもしれませんが、フェッチ/更新/削除はすべてのパーティションで機能すると言っています。インサートはどうですか？挿入はどこかにデータを追加する必要があるため、ONLYが使用されているかのように常に機能します。したがって、users_1に行を追加する必要がある場合、これを行う必要があります。

 INSERT INTO users_1 ...

どうやら見栄えはよくありませんが、心配しないでください。

実際のパーティションを作成してみましょう。まず、パーティションキーが何であるか、つまり、パーティションが選択されるアルゴリズムによって決定する必要があります。

最も明白なものがいくつかあります。

日付によるパーティション-たとえば、ユーザーが作成された年に基づいてパーティションを選択します。
さまざまな識別子によるパーティション-たとえば、最初の100万人のユーザー、200万人のユーザーなど。
何か別のパーティション-たとえば、ユーザー名の最初の文字によるパーティション。

「ユーザーに代わってハッシュパーティション」など、使用頻度の低い他のオプションがいくつかあります。

あるスキームを使用する価値があり、別のスキームを使用する価値がないのはなぜですか？それらの長所と短所を見てみましょう。

日付によるパーティション：
- 利点：
  - わかりやすい。
  - このテーブルの行数は非常に安定しています。
- 欠点：
  - サポートが必要-時々新しいパーティションを追加する必要があります。
  - ユーザー名またはIDで検索するには、すべてのパーティションをスキャンする必要があります。
IDによるパーティショニング：
- 利点：
  - わかりやすい。
  - パーティション内の行数は100％安定します。
- 欠点：
  - サポートが必要-時々新しいパーティションを追加する必要があります。
  - ユーザー名またはIDで検索するには、すべてのパーティションをスキャンする必要があります。
ユーザー名の最初の文字による分割：
- 利点：
  - わかりやすい。
  - サポートなし-厳密に定義されたパーティションのセットがあり、新しいパーティションを追加する必要はありません。
- 欠点：
  - パーティション内の行数は着実に増加します。
  - パーティションによっては、他のパーティションよりもかなり多くの行があります（「y *」よりも「t *」で始まるニックネームを持つ人が多い）。
  - IDで検索するには、すべてのパーティションをスキャンする必要があります。
ユーザー名のハッシュによる分割：
- 利点：
  - サポートなし-厳密に定義されたパーティションのセットがあり、新しいパーティションを追加する必要はありません。
  - 行はパーティション間で均等に分配されます。
- 欠点：
  - パーティション内の行数は着実に増加します。
  - IDで検索するには、すべてのパーティションをスキャンする必要があります。
  - ユーザー名による検索では、1つのパーティションのみがスキャンされますが、追加の条件が使用される場合のみです。

ハッシュ化されたユーザー名アプローチの最後の欠点は非常に興味深いものです。そこで何が起こるか見てみましょう。

最初に、さらにパーティションを作成する必要があります。

 $ create table users_2 ( like users including all ); $ alter table users_2 inherit users; ... $ create table users_10 ( like users including all ); $ alter table users_10 inherit users;

現在、usersテーブルには10個のパーティションがあります。

 $ \d users Table "public.users" Column | Type | Modifiers ----------------+--------------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) username | text | not null password | text | created_on | timestamp with time zone | not null last_logged_on | timestamp with time zone | not null Indexes: "users_pkey" PRIMARY KEY, btree (id) "users_username_key" UNIQUE CONSTRAINT, btree (username) "newest_users" btree (created_on) Number of child tables: 10 (Use \d+ to list them.)

PostgreSQLにはconstraint_exclusionオプションがあります。また、「on」または「partition」に設定すると、PostgreSQLは一致する行を含むことができないパーティションをスキップします。

私のPgでは、これはデフォルトで設定されています：

 $ show constraint_exclusion; constraint_exclusion ---------------------- partition (1 row)

したがって、すべてのパーティションとベーステーブルには意味のある制限がないため、クエリはすべての11のテーブル（メインパーティションと10パーティション）を一度にスキャンします。

 $ explain analyze select * from users where id = 123; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Append (cost=0.29..89.98 rows=11 width=81) (actual time=0.009..0.013 rows=1 loops=1) -> Index Scan using users_pkey on users (cost=0.29..8.30 rows=1 width=48) (actual time=0.007..0.007 rows=1 loops=1) Index Cond: (id = 123) -> Index Scan using users_1_pkey on users_1 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_2_pkey on users_2 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_3_pkey on users_3 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_4_pkey on users_4 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_5_pkey on users_5 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_6_pkey on users_6 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_7_pkey on users_7 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_8_pkey on users_8 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_9_pkey on users_9 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_10_pkey on users_10 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) Planning time: 1.321 ms Execution time: 0.087 ms (25 rows)

これはあまり効果的ではありませんが、制限を設けることができます。

パーティションがIDによるパーティション化によって形成され、各パーティションに100,000個の識別子が含まれているとします。

いくつかの制限を追加できます。

 $ alter table users_1 add constraint partition_check check (id >= 0 and id < 100000); $ alter table users_2 add constraint partition_check check (id >= 100000 and id < 200000); ... $ alter table users_10 add constraint partition_check check (id >= 900000 and id < 1000000);

前のリクエストを繰り返します：

 $ explain analyze select * from users where id = 123; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Append (cost=0.29..16.47 rows=2 width=66) (actual time=0.008..0.009 rows=1 loops=1) -> Index Scan using users_pkey on users (cost=0.29..8.30 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1) Index Cond: (id = 123) -> Index Scan using users_1_pkey on users_1 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) Planning time: 1.104 ms Execution time: 0.031 ms (7 rows)

2つのテーブルのみをスキャンします。メイン（すべてのデータが現在あり、制限がないため、除外できません）と適切なパーティションです。

いいですね

このようなパーティション化条件は、ユーザー名またはcreated_onで簡単に追加できます。ただし、パーティションキーがより複雑な場合はどうなるかを見てください。

 $ alter table users_1 drop constraint partition_check, add constraint partition_check check (abs( hashtext(username) ) % 10 = 0); $ alter table users_2 drop constraint partition_check, add constraint partition_check check (abs( hashtext(username) ) % 10 = 1); ... $ alter table users_10 drop constraint partition_check, add constraint partition_check check (abs( hashtext(username) ) % 10 = 9);

知らない場合、hashtext（）は文字列を受け取り、-2147483648〜2147483647の範囲の整数を返します。

単純な算術のおかげで、abs（hashtext（string））％10は常に0..9の範囲の値を生成し、どのパラメーターについても簡単に計算できることを知っています。

PostgreSQLはこれを知っていますか？

 $ explain analyze select * from users where username = 'depesz'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Append (cost=0.29..89.98 rows=11 width=81) (actual time=0.023..0.023 rows=0 loops=1) -> Index Scan using users_username_key on users (cost=0.29..8.30 rows=1 width=48) (actual time=0.016..0.016 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_1_username_key on users_1 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_2_username_key on users_2 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_3_username_key on users_3 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_4_username_key on users_4 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_5_username_key on users_5 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_6_username_key on users_6 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_7_username_key on users_7 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_8_username_key on users_8 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_9_username_key on users_9 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (username = 'depesz'::text) -> Index Scan using users_10_username_key on users_10 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (username = 'depesz'::text) Planning time: 1.092 ms Execution time: 0.095 ms (25 rows)

いや知りません。基本的に、PostgreSQLは範囲（または同等）に基づいた検証のためにパーティションを自動的に除外することしかできません。機能に基づくものはありません。数字からの単純なモジュールでさえすでに多すぎます：

 $ alter table users_1 drop constraint partition_check, add constraint partition_check check ( id % 10 = 0); $ alter table users_2 drop constraint partition_check, add constraint partition_check check ( id % 10 = 1); ... $ alter table users_10 drop constraint partition_check, add constraint partition_check check ( id % 10 = 9); $ explain analyze select * from users where id = 123; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Append (cost=0.29..89.98 rows=11 width=81) (actual time=0.009..0.016 rows=1 loops=1) -> Index Scan using users_pkey on users (cost=0.29..8.30 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1) Index Cond: (id = 123) -> Index Scan using users_1_pkey on users_1 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_2_pkey on users_2 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_3_pkey on users_3 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_4_pkey on users_4 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_5_pkey on users_5 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_6_pkey on users_6 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_7_pkey on users_7 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_8_pkey on users_8 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_9_pkey on users_9 (cost=0.15..8.17 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=1) Index Cond: (id = 123) -> Index Scan using users_10_pkey on users_10 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) Planning time: 0.973 ms Execution time: 0.086 ms (25 rows)

これは悲しいです。なぜなら、番号モジュールに基づいたパーティションキーには、（私の意見では）1つの大きな利点があります-安定したパーティション数です。大量のデータに達したときに再度パーティションを作成することを決定しない限り、将来それらを作成する必要はありません。

これは、複雑な（数字の関数またはモジュールに基づく）パーティションキーを使用できないことを意味しますか？いやこれらを使用できますが、クエリはより複雑になります。

 $ explain analyze select * from users where id = 123 and id % 10 = 123 % 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Append (cost=0.29..16.48 rows=2 width=66) (actual time=0.010..0.011 rows=1 loops=1) -> Index Scan using users_pkey on users (cost=0.29..8.31 rows=1 width=48) (actual time=0.010..0.010 rows=1 loops=1) Index Cond: (id = 123) Filter: ((id % 10) = 3) -> Index Scan using users_4_pkey on users_4 (cost=0.15..8.17 rows=1 width=84) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (id = 123) Filter: ((id % 10) = 3) Planning time: 1.018 ms Execution time: 0.033 ms (9 rows)

ここに、次のような別の条件を追加しました。

 id % 10 = 123 % 10

PostgreSQLは、式を解析するプロセスで書き換えることができます。

 id % 10 = 3

整数の％演算子が不変であることを知っているからです。そして今、クエリの一部として、正確なパーティションキー-id％10 = 3があります。したがって、Pgはパーティションキー（つまり、ベーステーブル）を持たないか、クエリに対応するキーを持つパーティションのみを使用できます。。

追加の合併症を導入するかどうかはあなた次第です。

リクエストを変更したくない場合、そして時々新しいパーティションを追加するのが難しくない場合は、私の以前の同僚であるキース・フィスケが書いたPGパーティションマネージャーに慣れる必要があります-これはパーティションを決定するために手動で実行する機能のセットですこれをクラウンで実行すると、将来のデータ用に新しいパーティションが作成されます。

挿入についてはすでに言及しましたが、パーティションに追加する必要がある挿入の問題を回避する方法については説明しませんでした。

一般に、これはトリガージョブです。中国のPg_partmanがこれらのトリガーを作成しますが、何が起こっているのかを理解し、pg_partmanをブラックボックスとしてではなく、退屈な作業を行う補助ツールとして使用してください。

現在、私のパーティション構成は数値ベースのモジュールに基づいています（私が知る限り、partmanはこれを行うことができません）。適切なトリガー関数を作成しましょう。データがusersテーブルに挿入されたときに呼び出され、エラーなしで適切なパーティションに挿入をリダイレクトする必要があります。したがって、次のように記述します。

 $ create function partition_for_users() returns trigger as $$ DECLARE v_parition_name text; BEGIN v_parition_name := format( 'users_%s', 1 + NEW.id % 10 ); execute 'INSERT INTO ' || v_parition_name || ' VALUES ( ($1).* )' USING NEW; return NULL; END; $$ language plpgsql;

そして今、トリガーの定義：

 $ create trigger partition_users before insert on users for each row execute procedure partition_for_users();

行を追加してみましょう。

 $ insert into users (username, password, created_on, last_logged_on) values ( 'depesz', random_string( 20 ), now() - '2 years'::interval * random(), now() - '2 years'::interval * random() ); $ select currval('users_id_seq'); currval --------- 10003 (1 row)

データが表示されるかどうか見てみましょう：

 $ select * from users where username = 'depesz'; id | username | password | created_on | last_logged_on -------+----------+----------------------+-------------------------------+------------------------------- 10003 | depesz | bp7zwy8k3t3a37chf1hf | 2014-10-24 02:45:51.398824+02 | 2015-02-05 18:24:57.072424+01 (1 row)

よさそうですが、どこにありますか？メインテーブルに？

 $ select * from only users where username = 'depesz'; id | username | password | created_on | last_logged_on ----+----------+----------+------------+---------------- (0 rows)

いやおそらく適切なパーティションにあるのでしょうか？

 $ select * from users_4 where username = 'depesz'; id | username | password | created_on | last_logged_on -------+----------+----------------------+-------------------------------+------------------------------- 10003 | depesz | bp7zwy8k3t3a37chf1hf | 2014-10-24 02:45:51.398824+02 | 2015-02-05 18:24:57.072424+01

はいトリガーは機能しました。しかし、この方法には1つの欠点があります。つまり、「RETURNING」は機能しません。

 $ insert into users (username, password, created_on, last_logged_on) values ( 'test', random_string( 20 ), now() - '2 years'::interval * random(), now() - '2 years'::interval * random() ) returning *; id | username | password | created_on | last_logged_on ----+----------+----------+------------+---------------- (0 rows)

これは、パフォーマーの観点から、挿入が何も返さなかったため、トリガーがNULLを返したために発生します。

この問題の解決策を見つけることができませんでした。私の場合、nextval（）を使用して事前に初期キー値を取得してから、完成した値を挿入することを好みます-挿入後に既に存在するように：

 $ select nextval('users_id_seq'); nextval --------- 10005 (1 row) $ insert into users (id, username, password, created_on, last_logged_on) values ( 10005, 'test', random_string( 20 ), now() - '2 years'::interval * random(), now() - '2 years'::interval * random() );

これには1つの説明があります。別の「挿入」の各ラインPGが必要である実行されますので、トリガーを介してすべてのインサートをルーティングすることは、それらが遅くなります。

緊急のバルク挿入のために最善の解決策は、パーティションを直接操作するためにそれらを取得することである。したがって、例えば、代わりに

 COPY users FROM stdin; .... \.

最初に、たとえば次のようにして、必要な識別子の数を把握します。

 select nextval('users_id_seq') from generate_series(1, 100);

そして、適切なものを発行します。

 COPY users_p1 FROM stdin; .... \. COPY users_p2 FROM stdin; .... \. ...

最も便利な方法ではありませんが、大量のデータをパーティションテーブルにインポートする場合に便利です。

そのため、今度はパーティション化とは何か、どのように機能するかを理解する必要があります。タイトルの次の質問は：なぜですか？

答えるのは比較的簡単です。生産性を確保するか、メンテナンスを簡素化します。

簡単な例として、10億行（1,000,000,000）のusersテーブルを取り上げます。

インデックスの深さが増すだけで、インデックスを考慮しても、その中の検索は徐々に価格が上昇します。

これは私の小さなテストパターンでも見ることができます。

すべてのパーティションとパーティショントリガーをリセットしましょう。

 $ drop table users_1; $ drop table users_2; ... $ drop table users_10; $ drop trigger partition_users on users;

現在、ユーザーテーブルには10,000行あります。ユーザー名による単純な検索には0.020ミリ秒かかります。これは3回の試行のうちで最適な時間です。

行を追加する場合：

 $ insert into users (username, password, created_on, last_logged_on) select random_string( (random() * 4 + 5)::int4), random_string( 20 ), now() - '2 years'::interval * random(), now() - '2 years'::interval * random() from generate_series(1, 100000);

同じ検索には0.025msかかります。検索時間を0.005ms増やすことは小さいかもしれませんが、まだ110,000行しかないため、システムには他のテーブルがないため、インデックス付きのテーブル全体がメモリに格納されます。

もちろん、パーティション分割には意味があります。たとえば、通常ユーザー名で検索する場合、IDでパーティションを作成しても意味がありません-すべてのパーティションでPgを検索する必要があります（これは将来的に意味がありますが、この記事の最後で説明します）。

つまり、通常は何を尋ねるかを決定する必要があります。それが何らかのキーによる検索なのか、おそらく最新のデータのみを表示するのか、ということです。そして、Pgがスキャンする必要があるパーティションの数を制限するような方法でパーティションを作成します。

重要なことは、特にプログラマよりもデータベース管理者である場合は、パーティション化により作業が楽になることです。メンテナンスタスク（インデックスの作成、バキューム、pg_reorg / pg_repack、pg_dump）は、パーティションと同じ数のサブタスクに効果的に分割できます。したがって、大きなテーブルを再パッケージ化するための1時間のトランザクションの代わりに、20の高速化と少ないディスクスペーストランザクションの使用が可能になり、結果は一般に同じになります！

もちろん、ビジネスは良いニュースに限定されません。パーティション化には1つの大きな欠点があります。パーティション化されたテーブルを指す外部キーを持つことはできません。

うまくいきません。パーティションを直接指す外部キーを取得することもできますが、これは（通常）無意味です。

これが個人にとって大きな問題であるかどうかは、ユーザーケースによって異なります。ほとんどの場合、パーティショニングが正当化されるほど十分に大きなテーブルに到達すると、アプリケーションは十分にテストされ、外部キーの欠如を受け入れることができるように思えます。さらに、「悪い」値の存在をテストするタスクをいつでもクラウンに追加できます。

これで、パーティショニングとは何か、どのように機能し、なぜ使用されるかがわかりました。最後の疑問は、テーブルをパーティション分割されたテーブルに変換する方法です。通常、アプリケーションはパーティションテーブルで作成されません-最初はこれは意味がありません。しかし、すぐに、多くの行を含む何らかの種類のテーブルが作成され、「作成時にすぐにパーティションを作成する必要がありました」と思うようになります。

ただし、アプリケーションが既に実行されている場合でも、パーティションを分割できますか？最小限の問題で？

見てみましょう。テストのために、97 GBのpgbenchデータベースを作成しました。そのほとんど、83 GBは、666,600,000のエントリを含むpgbench_accountsテーブルにあります。

このテーブルのスキームは次のとおりです。

  Table "public.pgbench_accounts" Column | Type | Modifiers ----------+---------------+----------- aid | integer | not null bid | integer | abalance | integer | filler | character(84) | Indexes: "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)

そして、それに対するすべての要求は、1から666,600,000の値を含む援助列に基づいています。

それでは、援助値の範囲に基づいて分割しましょう。

各パーティションに1,000万行を配置すると、67個のパーティションが必要になります。

しかし、自分のアクションが作業を中断しないことをどのように確認できますか？とても簡単です。pgbenchをループで実行します。正確な速度レポートには興味がありません; pgbenchの動作に私の作業がどの程度影響するかについての十分な情報があります。

これらの考えで、私は関数を実行しました：

 $ while true do date pgbench -T 10 -c 2 bench done 2>&1 | tee pgbench.log

彼女は10秒のテストを実行し、統計をファイルに保存するので、後で結果とパーティション分割作業の関係を追跡できます。

すべての準備ができたら、適切な場所にチェックを入れてパーティションを作成します。

 do $$ declare i int4; aid_min INT4; aid_max INT4; begin for i in 1..67 loop aid_min := (i - 1) * 10000000 + 1; aid_max := i * 10000000; execute format('CREATE TABLE pgbench_accounts_p_%s ( like pgbench_accounts including all )', i ); execute format('ALTER TABLE pgbench_accounts_p_%s inherit pgbench_accounts', i); execute format('ALTER TABLE pgbench_accounts_p_%s add constraint partitioning_check check ( aid >= %s AND aid <= %s )', i, aid_min, aid_max ); end loop; end; $$;

パーティションの準備ができており、チェックが使用されていることを確認できます。

 $ explain analyze select * from pgbench_accounts where aid = 123; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Append (cost=0.57..16.75 rows=2 width=224) (actual time=6.468..6.473 rows=1 loops=1) -> Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.57..8.59 rows=1 width=97) (actual time=6.468..6.469 rows=1 loops=1) Index Cond: (aid = 123) -> Index Scan using pgbench_accounts_p_1_pkey on pgbench_accounts_p_1 (cost=0.14..8.16 rows=1 width=352) (actual time=0.004..0.004 rows=0 loops=1) Index Cond: (aid = 123) Planning time: 3.475 ms Execution time: 6.497 ms (7 rows)

次に、トリガー「ルーター」を追加する必要があります。

 $ create function partition_for_accounts() returns trigger as $$ DECLARE v_parition_name text; BEGIN v_parition_name := format( 'pgbench_accounts_p_%s', 1 + ( NEW.aid - 1 ) / 10000000 ); execute 'INSERT INTO ' || v_parition_name || ' VALUES ( ($1).* )' USING NEW; return NULL; END; $$ language plpgsql; $ create trigger partition_users before insert on pgbench_accounts for each row execute procedure partition_for_accounts();

これはすべて素晴らしいことですが、新しく挿入された行に対してのみ機能し、ソーステーブルには既に6億6600万行あります。それについてどうすればいいですか？

それらを移動する必要があります。これは理論的には比較的単純ですが、いくつかの落とし穴があります。

どのような場合でも、どのトランザクションでも（つまり、メインテーブルとパーティションから）両方の行が同時に表示されることはありません。
すべての行を削除してパーティションに挿入することはできません。これにより、移動中にベーステーブル全体がロックされるためです。

2番目の問題は、データロットを操作することで軽減できます。ただし、これにはSQLを使用できません。

時々、誰かが大規模な操作をチャンクに分割し、データのチャンクを繰り返し処理する単一のsql関数で呼び出す方法を尋ねます。このアプローチには1つの基本的な問題があります。関数呼び出しはトランザクションです。したがって、この関数が実行するすべては、単一のトランザクションで発生します。したがって、ロックの問題は解決されません。

ただし、psqlを使用して（またはruby、perl、python-関係ありません）、各部分で少数の行のみを移動し、メインテーブルを短時間ロックします。

一般に、単一のリクエストは次のようになります。

 with x as (delete from only pgbench_accounts where aid between .. and .. returning *) insert into appropriate_partition select * from x;

サービングサイズ1000を選択しました。これは、プロセスが引きずられないように十分小さく、サービングの総数が過度にならないように十分に大きい（666千）です。

それでは、バッチファイルを作成しましょう。

 \pset format unaligned \pset tuples_only true \o /tmp/run.batch.migration.sql SELECT format( 'with x as (DELETE FROM ONLY pgbench_accounts WHERE aid >= %s AND aid <= %s returning *) INSERT INTO pgbench_accounts_p_%s SELECT * FROM x;', i, i + 999, ( i - 1 ) / 10000000 + 1 ) FROM generate_series( 1, 666600000, 1000 ) i; \o

これをpsqlで実行すると、彼は/tmp/run.batch.migration.sqlファイルを作成しました。これは、次のような666,600のクエリが含まれているため、非常に大きい（97 GB）です。

 with x as (DELETE FROM ONLY pgbench_accounts WHERE aid >= 1 AND aid <= 1000 returning *) INSERT INTO pgbench_accounts_p_1 SELECT * FROM x; with x as (DELETE FROM ONLY pgbench_accounts WHERE aid >= 1001 AND aid <= 2000 returning *) INSERT INTO pgbench_accounts_p_1 SELECT * FROM x; with x as (DELETE FROM ONLY pgbench_accounts WHERE aid >= 2001 AND aid <= 3000 returning *) INSERT INTO pgbench_accounts_p_1 SELECT * FROM x;

すべての準備が整ったので、プロセスを開始できます（もちろん、「screen」または「tmux」を使用して、サーバーへのssh接続が切断されても何も失われないようにします）。

 $ psql -d bench -f /tmp/run.batch.migration.sql

少し時間がかかります。テストデータベースの場合、平均パケットは約92ミリ秒で処理されます。つまり、データを先に進めるのに17時間必要です。

現実には、たった7時間しかかかりませんでした。悪くない。

最後に、pgbench_accountsテーブルの重量は約83GBです（pgbench、relocation、vacuumを処理するのに十分な速度のディスクではないと思います）。

しかし、私はチェックし、すべての行がパーティションに移動したように見えます：

 $ select count(*) from only pgbench_accounts; count ------- 0 (1 row)

移動プロセス中のpgbenchの速度はどうですか？

4つのフェーズがありました。

仕事を移す前。
パーティションを作成した後。
トリガーを作成した後。
移動中。

結果は？

  phase | min | avg | max ---------+-----------+------------------+----------- Phase 1 | 28.662223 | 64.0359512839506 | 87.219148 Phase 2 | 21.147816 | 56.2721418360656 | 75.967217 Phase 3 | 23.868018 | 58.6375074477612 | 75.335558 Phase 4 | 5.222364 | 23.6086916565574 | 65.770852 (4 rows)

はい、移動するとすべてが遅くなりました。しかし、これはSSDではなくSATAディスクを備えた通常のパーソナルコンピュータであり、常に高負荷下にあることに注意してください-pgbenchはできるだけ速くリクエストを実行しました。

さらに、バキュームが削除を適切に処理できないため、わずかな速度低下が発生しました。私の意見では、結果は絶対に受け入れられます。

最後に、私は次のことができました：

 $ truncate only pgbench_accounts;

そして、すべてが正常かどうかを確認するには：

 $ select count(*) from pgbench_accounts; count ----------- 666600000 (1 row)

これらはすべてエラーなしで、「実際のアプリケーション」の作業を中断することなく行われました。

最後に、パーティション分割が（比較的）すぐにさらにクールになることを追加します。最近、パーティションを異なるサーバーに保存できます。そして現在、並列スキャンを有効にするための作業が進行中です（ただし、この更新がバージョン9.6より前に表示されることはほとんどありません）。これにより、プロセス全体が大幅に改善されます。

このテキストがお役に立てば幸いです。

PostgreSQLでのテーブルパーティション分割のその他の側面について説明しますか？PG Day'16 Russiaカンファレンスのレポートのプログラムに、あなたにとって最も興味深いトピックを追加できることを嬉しく思います！アーリーバードチケットの販売を既に開始しています急いで最低価格で登録しましょう！

PostgreSQLでのパーティション分割-何？ なんで？ どうやって？

More articles:

PostgreSQLでのパーティション分割-何？なんで？どうやって？