#PLpgSQL
Explore tagged Tumblr posts
info-comp · 1 year ago
Text
This article walks through an example of implementing a DDL log in PostgreSQL using event triggers.
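A minimal sketch of what such an event-trigger-based DDL log can look like (the ddl_log table and trigger names below are placeholders of my own, not taken from the article):

CREATE TABLE ddl_log (
    event_time  timestamptz DEFAULT now(),
    username    text,
    command_tag text,
    object_type text,
    object_name text
);

CREATE OR REPLACE FUNCTION log_ddl() RETURNS event_trigger AS $$
DECLARE
    r record;
BEGIN
    -- pg_event_trigger_ddl_commands() is only callable from a ddl_command_end event trigger
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO ddl_log (username, command_tag, object_type, object_name)
        VALUES (current_user, r.command_tag, r.object_type, r.object_identity);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- On PostgreSQL 10 and older, use EXECUTE PROCEDURE instead of EXECUTE FUNCTION
CREATE EVENT TRIGGER trg_log_ddl ON ddl_command_end EXECUTE FUNCTION log_ddl();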
0 notes
thedbahub · 1 year ago
Text
How to Convert a SQL Server Stored Procedure to PostgreSQL
Are you migrating a database from SQL Server to PostgreSQL? One key task is converting your stored procedures from T-SQL to PL/pgSQL, PostgreSQL’s procedural language. In this article, you’ll learn a step-by-step process for translating a SQL Server stored procedure to its PostgreSQL equivalent. By the end, you’ll be able to confidently port your T-SQL code to run on Postgres. Understand the…
View On WordPress
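To give a flavour of what such a translation involves, here is a hedged sketch; the procedure, table and column names below are invented for illustration and do not come from the article:

-- Hypothetical T-SQL original (SQL Server):
--   CREATE PROCEDURE dbo.GetOrderCount @CustomerId INT, @OrderCount INT OUTPUT
--   AS
--   BEGIN
--       SELECT @OrderCount = COUNT(*) FROM dbo.Orders WHERE CustomerId = @CustomerId;
--   END
-- One possible PL/pgSQL equivalent:
CREATE OR REPLACE FUNCTION get_order_count(p_customer_id integer, OUT order_count integer)
LANGUAGE plpgsql AS $$
BEGIN
    SELECT count(*) INTO order_count FROM orders WHERE customer_id = p_customer_id;
END;
$$;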
0 notes
kennak · 2 years ago
Quote
Let's go!! Supabase keeps delivering exactly what I need. I'm constantly surprised by how poor the tooling around SQL/databases is. In particular, there is no decent formatter that supports plpgsql and doesn't completely mangle queries (in some cases formatters actually break them). The best options at the moment are, imo, TablePlus and DataGrip. I've tried many options in VS Code; there are some handy tools, but nothing that really has it all. (I'm also excited to see what else they release during this launch week!)
Postgres Language Server | Hacker News
0 notes
skidlv · 5 years ago
Text
A Telegram Bot in PostgreSQL, Part 8
Automation via recursion
We continue from the previous article.
The examples in the previous articles required each process to be started by hand. Now let's finish automating the process with a loop of sorts.
Let's create a new function:
CREATE OR REPLACE FUNCTION public.next_loop(param jsonb DEFAULT '{}'::jsonb)
    RETURNS jsonb
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE
AS $BODY$
DECLARE
    u text DEFAULT random()::text;
    j jsonb DEFAULT param;
BEGIN
    PERFORM dblink.dblink_connect(u, 'host=/run/postgresql dbname=<dbname> user=<user> password=<password>');
    PERFORM pg_advisory_unlock(100);
    PERFORM dblink.dblink_send_query(u, format('SELECT do_loop(%L)', j));
    PERFORM dblink.dblink_disconnect(u);
    RETURN j;
END
$BODY$;
This function uses the dblink extension, which needs to be installed in our database. Back in the first article the PL/Python procedural language was mentioned, with a link to the documentation describing how to install it; the dblink extension is installed in a similar way. In this example, however, I keep the extension in a separate schema named after it, which makes programming more convenient. The function also contains the placeholders <dbname>, <user> and <password>; fill these in according to the settings of your own database.
Here is how to create the schema and enable the extension with two statements:
CREATE SCHEMA IF NOT EXISTS dblink; CREATE EXTENSION IF NOT EXISTS dblink SCHEMA dblink;
Now the extension and its functions live in a separate schema, so they don't get mixed in with our own working functions.
Let's create one more function:
CREATE OR REPLACE FUNCTION public.do_loop(param jsonb DEFAULT '{}'::jsonb)
    RETURNS jsonb
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE
AS $BODY$
DECLARE
    l bool DEFAULT pg_try_advisory_lock(100);
    j jsonb DEFAULT param;
BEGIN
    IF l THEN
        PERFORM get_updates();
        PERFORM do_start_message();
        SELECT next_loop(j) INTO j;
    END IF;
    RETURN j;
END
$BODY$;
At first glance this looks more complicated: we are now creating more than one function at a time. In reality the logic has simply been split in two to make it easier to follow.
Let's briefly go through these two functions and how they work, starting with the last one and describing the sequence of operations. I've included parameter passing here only as an example; the parameters aren't actually used.
At the very start of the function a PostgreSQL advisory lock is taken (pg_try_advisory_lock), which prevents two copies of the loop from running at the same time. Inside the function we then check whether the lock was acquired successfully.
If the lock is acquired, three functions run one after another. The first two were covered in the previous articles, and the last one is the function we created first in this article — it is the one that makes the process loop.
The first function from this article does the following: it opens a dblink connection back to the current database, releases the advisory lock taken in the second function, and then runs the second function asynchronously. The parameter, as noted above, is just for illustration and carries no functionality. Finally it disconnects from the database and returns. This is how an endless loop is created with minimal load on the database.
You can test it by running this query:
SELECT public.do_loop()
Your bot will now continuously react to every new /start command in a loop. You can also arrange for this function to run at server startup, but that is a separate topic you can look up in the documentation.
To stop the loop, simply run the query:
SELECT pg_advisory_lock(100)
After this query runs, the endless loop will stop.
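To start the cycle again later (a small addition of my own, assuming the functions above), release the lock from the session that stopped it and call do_loop once more:

SELECT pg_advisory_unlock(100);  -- run in the session that executed pg_advisory_lock(100)
SELECT public.do_loop();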
These are essentially the main steps for building a Telegram bot implemented with a PostgreSQL database.
In the following articles we will look at more advanced automation techniques for the bot.
That wraps up this article. As a reminder, the working server I use is available at this address. Social pages for news and comments: Facebook, VKontakte, Telegram. Be sure to leave your comments and subscribe to my pages. Feel free to ask questions, which I will answer in upcoming articles. I also invite you to have a look at the page of the passive-income project I take part in.
1 note · View note
longflicks · 3 years ago
Text
Psequel gui windows
PSEQUEL GUI WINDOWS FOR MAC OS
PSEQUEL GUI WINDOWS INSTALL
PSEQUEL GUI WINDOWS UPDATE
In the good old MySQL world, my favorite client is Sequel Pro, but its support for PostgreSQL doesn't seem to be coming. I know there is a list of PostgreSQL GUI Tools. However, they are either web-based, Java-based or don't support the features I want. No, PSequel is written from scratch in Swift 2, although PSequel's UI is highly inspired by Sequel Pro. Well, pgAdmin is great for its feature-richness, but I found its UI clumsy and complicated. It is a free Administration Centre for the PostgreSQL database and, for many years, the 'standard' freely available GUI client for Postgresql, and so is bundled in many packaged installers. It provides a SQL query tool, an editor for procedural languages and a CRUD interface. It's also one of the few clients to provide a GUI front end to the plpgsql debugger. If the screen prompts you to enter a password, please enter your Mac's user password to continue. So just type your password and press the ENTER/RETURN key. When you type the password, it won't be displayed on screen, but the system will accept it. My Stable Diffusion GUI update 1.3.0 is out now: includes optimizedSD code, upscaling and face restoration, seamless mode, and a ton of fixes. This meant that everyone had to start pulling up Rethink docs, and learning the query syntax to update/delete/etc records, slowing down the efforts of front-end developers that otherwise don't need to know the query language.
PSEQUEL GUI WINDOWS INSTALL
Ruby -e '$(curl -fsSL )' /dev/null brew install caskroom/cask/brew-cask 2> /dev/null. PSequel is a Swift-based standalone OS X client application that provides a simple and straightforward PostgreSQL GUI designed to help you perform a number of basic operations. RethinkDB didn't have an admin GUI, like Robomongo and PSequel, that we had been accustomed to using on other projects. Press Command+Space, type Terminal and press the enter/return key. App description: sequel-pro (App: Sequel Pro.app). The most popular Linux alternative is DBeaver, which is both free and Open Source. If that doesn't suit you, our users have ranked 32 alternatives to PSequel and 15 are available for Linux, so hopefully you can find a suitable replacement. Can we have a link to download the full server ISO, not the core? I think the link MS provided is server core; it doesn't even have the GUI install interface ('Server Graphical Shell') when you install it. It's not the first time I install a Windows OS on the hardware of other manufacturers - unlike Windows, which is developed by Microsoft, there are no such options, only standard or data center.
PSEQUEL GUI WINDOWS FOR MAC OS
PSequel is not available for Linux, but there are plenty of alternatives that run on Linux with similar functionality. Just found this: PSequel, a PostgreSQL GUI tool for Mac OS X. PSequel provides a clean and simple interface to perform common PostgreSQL tasks quickly. For all the Postgres fans, here is a nice-looking tool for Mac OS X, designed for Yosemite. PSequel – PostgreSQL GUI tool for Mac OS X. Modern, native client with intuitive GUI tools to create, access, query & edit multiple databases: MySQL, PostgreSQL, SQLite, Microsoft SQL Server.
Tumblr media
0 notes
globalmediacampaign · 4 years ago
Text
Amazon Aurora PostgreSQL parameters, Part 4: ANSI compatibility options
Organizations today have a strategy to migrate from traditional databases and as they plan their migration, they don’t want to compromise on performance, availability, and security features. Amazon Aurora is a cloud native relational database service that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. The PostgreSQL-compatible edition of Aurora delivers up to 3X the throughput of standard PostgreSQL running on the same hardware, enabling existing PostgreSQL applications and tools to run without requiring modification. The combination of PostgreSQL compatibility with Aurora enterprise database capabilities provides an ideal target for commercial database migrations. Aurora PostgreSQL has enhancements at the engine level which improves the performance for high concurrent OLTP workload, and also helps bridge the feature gap between commercial engines and open-source engines. While the default parameter settings for Aurora PostgreSQL are good for most of the workloads, customers who migrate their workloads from commercial engines may need to tune some of the parameters according to performance and other non-functional requirements. Even for workloads which are migrated from PostgreSQL to Aurora PostgreSQL, we may need to relook at some of the parameter settings because of architectural differences and engine level optimizations. In this four part series, we explain parameters specific to Aurora PostgreSQL. We also delve into certain PostgreSQL database parameters that apply to Aurora PostgreSQL, how they behave differently, and how to set these parameters to leverage or control additional features in Aurora PostgreSQL. In part one of this series, we discussed the instance memory-related parameters and Query Plan Management parameters that can help you tune Amazon Aurora PostgreSQL. In part two, we discussed parameters related to replication, security, and logging. We covered Aurora PostgreSQL optimizer parameters in part three which can improve performance of queries. In this part, we will cover parameters which can align Aurora PostgreSQL closer to American National Standards Institute (ANSI) standards and reduce the migration effort when migrating from commercial engines. The ANSI has approved committees of standards developing organizations that publish best practices and standards for database query languages. Most vendors modify SQL to meet their needs and generally base their programs off the current version of this standard. The international standard (now ISO/IEC 9075) has been revised periodically ever since the first in 1986 and most recently in 2016. The PostgreSQL community tries to maintain compatibility with ANSI SQL. But some PostgreSQL behaviors don’t exactly comply with ANSI specifications. In other cases, although PostgreSQL complies with ANSI specifications, the syntax accepted by PostgreSQL is slightly different from commercial engines. Several customers, especially ISVs, strive to keep their code ANSI compatible so as to allow for DB engine independence for their product or offering. Amazon Aurora PostgreSQL adds additional capabilities that can be helpful for retaining behavior when migrating from other databases, such as Oracle or Microsoft SQL Server. These features were introduced in Aurora PostgreSQL 3.1.0 (compatible with PostgreSQL 11.6) and Aurora PostgreSQL 2.4.0 (compatible with PostgreSQL v10.11) and are controlled by additional parameters. 
These features are also available in newer major versions release such as Aurora PostgreSQL 4.x (compatible with PostgreSQL 12). In this post, we cover parameters that control these compatibility behaviors in Aurora PostgreSQL. ansi_constraint_trigger_ordering This parameter controls whether Aurora PostgreSQL retains PostgreSQL behavior or complies with ANSI specifications regarding the run order for user-defined trigger actions and triggers defined for internal constraints. Switching it off reverts back to PostgreSQL behavior, meaning triggers run in alphabetical order. As per the pg_settings catalog, this parameter is described as the following code: pgtraining=> select short_desc, extra_desc from pg_settings where name='ansi_constraint_trigger_ordering'; -[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------ short_desc | Change the firing order of constraint triggers to be compatible with the ANSI SQL standard. extra_desc | When turned on, the firing order of constraint triggers in the after trigger queue is modified to be compatible with the ANSI SQL standard, while to the extent possible, while not changing the semantics of PostgreSQL applications that would have complied with the SQL standard without the parameter being turned on. Internal constraint triggers are fired first in alphabetical order, followed by user-defined constraint triggers in alphabetical order, followed by firing any other triggers, in alphabetical order. Let’s understand how this parameter affects your query behavior. PostgreSQL behavior In PostgreSQL, if two triggers have the same firing criteria (such as AFTER INSERT FOR EVERY ROW or BEFORE DELETE FOR EVERY ROW), the run order is decided based on their alphabetical order (pure ASCII sorting). The triggers with a name in uppercase are fired first, and followed by the triggers in lowercase in alphabetical order. This can be useful if you want to control the order of your triggers. For example, I can define two before insert triggers on a table pgbench_branches: pgbench_branches_trig_B_I_R_001 and pgbench_branches_trig_B_I_R_010. In this case, the trigger pgbench_branches_trig_B_I_R_001 fires before pgbench_branches_trig_B_I_R_010. ANSI standards SQL ANSI standards require that triggers be fired in the order in which they’re created. Although this makes sense, it adds an additional responsibility on programmers to drop and create all the triggers whenever introducing a new trigger. To make things easier, some of the engines implement an additional feature so you can specify an additional property—ORDER—while defining the trigger. PostgreSQL takes a different approach; it’s not hard to emulate what other engines offer with additional syntax by following a naming convention as we discussed. Internal triggers PostgreSQL implements internal and user-defined constraints as triggers. For example, even though we don’t define any trigger on pgbench_tellers, internal triggers are defined because of a referential integrity constraint (commonly referred to as a foreign key). Let’s look at the triggers currently defined on pgbench_tellers , which has four internal triggers. 
The internal triggers are defined to trigger an action or check whenever we use data modification language (DML) on pgbench_tellers: pgtraining=> select tgrelid::regclass trigger_table, tgname trigger_name, tgfoid::regproc trigger_function, tgisinternal is_trigger_internal, tgconstrrelid::regclass parent_table, tgconstraint::regclass, tginitdeferred is_constraint_trigger_initially_deferred, tgdeferrable is_constraint_trigger_deferrable from pg_trigger where tgrelid::regclass::text='pgbench_tellers'; trigger_table | trigger_name | trigger_function | is_trigger_internal | parent_table | tgconstraint | is_constraint_trigger_initially_deferred | is_cons traint_trigger_deferrable -----------------+------------------------------+------------------------+---------------------+------------------+--------------+------------------------------------------+-------- -------------------------- pgbench_tellers | RI_ConstraintTrigger_c_22663 | "RI_FKey_check_ins" | t | pgbench_branches | 22660 | f | f pgbench_tellers | RI_ConstraintTrigger_c_22664 | "RI_FKey_check_upd" | t | pgbench_branches | 22660 | f | f pgbench_tellers | RI_ConstraintTrigger_a_22676 | "RI_FKey_noaction_del" | t | pgbench_history | 22675 | f | f pgbench_tellers | RI_ConstraintTrigger_a_22677 | "RI_FKey_noaction_upd" | t | pgbench_history | 22675 | f | f (4 rows) Now let’s look at triggers defined on pgbench_branches. Triggers are defined to cascade or restrict an action to the child table when a DML fires on pgbench_branches. A set of triggers is defined for each parent or child table—pgbench_tellers and pgbench_accounts: pgtraining=> select tgrelid::regclass trigger_table, tgname trigger_name, tgfoid::regproc trigger_function, tgisinternal is_trigger_internal, tgconstrrelid::regclass parent_table, tgconstraint::regclass, tginitdeferred is_constraint_trigger_initially_deferred, tgdeferrable is_constraint_trigger_deferrable from pg_trigger where tgrelid::regclass::text='pgbench_branches'; trigger_table | trigger_name | trigger_function | is_trigger_internal | parent_table | tgconstraint | is_constraint_trigger_initially_deferred | is_con straint_trigger_deferrable ------------------+------------------------------+------------------------+ pgbench_branches | RI_ConstraintTrigger_a_22661 | "RI_FKey_noaction_del" | t | pgbench_tellers | 22660 | f | f pgbench_branches | RI_ConstraintTrigger_a_22662 | "RI_FKey_noaction_upd" | t | pgbench_tellers | 22660 | f | f pgbench_branches | RI_ConstraintTrigger_a_22666 | "RI_FKey_noaction_del" | t | pgbench_accounts | 22665 | f | f pgbench_branches | RI_ConstraintTrigger_a_22667 | "RI_FKey_noaction_upd" | t | pgbench_accounts | 22665 | f | f pgbench_branches | RI_ConstraintTrigger_a_22671 | "RI_FKey_noaction_del" | t | pgbench_history | 22670 | f | f pgbench_branches | RI_ConstraintTrigger_a_22672 | "RI_FKey_noaction_upd" | t | pgbench_history | 22670 | f | f (6 rows) Let’s add a new AFTER TRIGGER on pgbench_teller, which adds a row to pgbench_branches by selecting a row from pgbench_teller. We’re not using the new variable here to insert an incoming row. 
Instead, we’re getting a row that exists in pgbench_teller but the corresponding branch doesn’t exist in pgbench_branches: create or replace function fix_pgbench_branches() returns trigger as $body$ begin insert into public.pgbench_branches (bid,bbalance,filler) select t.bid, 0, 'the presence of this row in pgbench_tellers (child table) violates the SQL standard' from public.pgbench_tellers t where t.bid not in (select t1.bid from public.pgbench_branches t1); return NULL; end; $body$ language plpgsql; create trigger "teller_trig" after insert on public.pgbench_tellers for each row execute procedure fix_pgbench_branches(); Let’s look at the triggers currently defined on pgbench_tellers. In addition to the four internal triggers we saw earlier, the code contains a user-defined trigger (the one we just created): pgtraining=> select tgrelid::regclass trigger_table, tgname trigger_name, tgfoid::regproc trigger_function, tgisinternal is_trigger_internal, tgconstrrelid::regclass parent_table, tgconstraint::regclass, tginitdeferred is_constraint_trigger_initially_deferred, tgdeferrable is_constraint_trigger_deferrable from pg_trigger where tgrelid::regclass::text='pgbench_tellers'; trigger_table | trigger_name | trigger_function | is_trigger_internal | parent_table | tgconstraint | is_constraint_trigger_initially_deferred | is_cons traint_trigger_deferrable -----------------+------------------------------+------------------------+---------------------+------------------+--------------+------------------------------------------+-------- -------------------------- pgbench_tellers | RI_ConstraintTrigger_c_22663 | "RI_FKey_check_ins" | t | pgbench_branches | 22660 | f | f pgbench_tellers | RI_ConstraintTrigger_c_22664 | "RI_FKey_check_upd" | t | pgbench_branches | 22660 | f | f pgbench_tellers | RI_ConstraintTrigger_a_22676 | "RI_FKey_noaction_del" | t | pgbench_history | 22675 | f | f pgbench_tellers | RI_ConstraintTrigger_a_22677 | "RI_FKey_noaction_upd" | t | pgbench_history | 22675 | f | f pgbench_tellers | teller_trig | fix_pgbench_branches | f | - | - | f | f (5 rows) Let’s insert a row in pgbench_tellers where the bid does not yet exist in pgbench_branches: insert into public.pgbench_tellers (tid, bid, tbalance) values ( ( select max(tid)+1 from public.pgbench_tellers ), ( select max(bid)+1 from public.pgbench_branches ), 0 ); The insert fails with a foreign key violation constraint exception because before the user-defined trigger could fire and insert a row in pgbench_branches, the trigger related to the referential integrity constraint (RI_ConstraintTrigger_c_22663) was fired, which rejected the row: pgtraining=> insert into public.pgbench_tellers (tid, bid, tbalance) pgtraining-> values ( pgtraining(> ( select max(tid)+1 from public.pgbench_tellers ), pgtraining(> ( select max(bid)+1 from public.pgbench_branches ), pgtraining(> 0 pgtraining(> ); ERROR: insert or update on table "pgbench_tellers" violates foreign key constraint "pgbench_tellers_bid_fkey" DETAIL: Key (bid)=(50001) is not present in table "pgbench_branches". When multiple triggers meet the same firing criteria, they are fired in alphabetical order. For more information, see Overview of Trigger Behavior. 
If we have a different name for our trigger, it has a different impact: pgtraining=> alter trigger teller_trig ON public.pgbench_tellers RENAME TO "PGBENCH_TELLER_TRIGGER"; ALTER TRIGGER Now let’s execute our insert statement again: pgtraining=> insert into public.pgbench_tellers (tid, bid, tbalance) values ( ( select max(tid)+1 from public.pgbench_tellers ), ( select max(bid)+1 from public.pgbench_branches ), 0 ); INSERT 0 1 What changed in this case was the name of the trigger. Now that the trigger name is in uppercase and starts with P, it can fire before the trigger defined for the referential integrity constraint (RI_ConstraintTrigger_c_22663). This behavior of PostgreSQL isn’t compliant with ANSI specifications and it can cause incompatibility when migrating an application from another relational database. Effect of the Aurora PostgreSQL parameter Now let’s change the parameter ansi_constraint_trigger_ordering in the DB cluster parameter group with the Aurora instance we’re using for these tests: pgtraining=> show ansi_force_foreign_key_checks; ansi_force_foreign_key_checks ------------------------------- on (1 row) pgtraining=> show ansi_constraint_trigger_ordering ; ansi_constraint_trigger_ordering ---------------------------------- on (1 row) pgtraining=> insert into public.pgbench_tellers (tid, bid, tbalance) values ( ( select max(tid)+1 from public.pgbench_tellers ), ( select max(bid)+1 from public.pgbench_branches ), 0 ); ERROR: insert or update on table "pgbench_tellers" violates foreign key constraint "pgbench_tellers_bid_fkey" DETAIL: Key (bid)=(50002) is not present in table "pgbench_branches". This parameter makes sure that PostgreSQL follows the ANSI specifications. It ensures that internal constraint triggers are fired first, followed by user-defined constraint triggers, then user-defined triggers. If you’re using an application that’s compatible with PostgreSQL and prefer to stick to default behavior, you can switch this off in the DB cluster parameter group. Although you can’t change this parameter for a specific transaction or session, a change to this parameter is dynamic and doesn’t require a restart of the DB instance. ansi_force_foreign_key_checks This parameter controls whether Aurora PostgreSQL retains PostgreSQL behavior or complies with ANSI specifications for imposing foreign key constraints when a cascaded action is defined in the constraint. Switching it off reverts back to PostgreSQL behavior. The following description is provided in pg_settings: pgtraining=> select name, short_desc, extra_desc from pg_settings where name like 'ansi_force_foreign_key_checks'; -[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- name | ansi_force_foreign_key_checks short_desc | Ensure referential actions such as cascaded delete or cascaded update will always occur regardless of the various trigger contexs that exist for the action. extra_desc | When turned on, Ensure referential actions such as cascaded delete or cascaded update will always occur regardless of the various trigger contexs that exist for the action. Let’s understand how this parameter affects your query behavior. ANSI standards SQL ANSI standards require that any operation cascaded to a child table because of a referential integrity constraint should be applied irrespective of trigger actions defined on the child table. 
Let’s consider an example of a trigger that is defined on a child table to trigger upon delete and the trigger is defined to skip deletion and perform some other operation instead. Now if we delete a row from the parent table, it will lead to delete on the child table as well. In such a scenario, the trigger will not impact cascaded delete operation and the rows from the child table will be removed irrespective of the trigger behavior defined. PostgreSQL behavior For PostgreSQL, because the referential integrity constraints are defined as an AFTER trigger, there is a chance that a cascaded delete or cascaded update for a child table is skipped as an effect of a BEFORE trigger that exists on the child table. This leaves the database in an inconsistent state which is hard to debug unless you drop and recreate the foreign key constraint. The inconsistency also makes it hard to trust table metadata information (foreign key constraint) for the purpose of query optimization, such as removing redundant inner joins on the guarantees of referential integrity constraints. Let’s see how it works in a practical scenario. In the following code, we make changes to the foreign key constraint on pgbench_teller and pgbench_accounts so that any DELETE on the parent table (pgbench_branches) is also CASCADED to these child tables: pgtraining=> alter table pgbench_tellers drop constraint pgbench_tellers_bid_fkey; ALTER TABLE pgtraining=> alter table pgbench_tellers add constraint pgbench_tellers_bid_fkey FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) on delete cascade; ALTER TABLE pgtraining=> alter table pgbench_accounts drop constraint pgbench_accounts_bid_fkey ; ALTER TABLE pgtraining=> alter table pgbench_accounts add constraint pgbench_accounts_bid_fkey FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) on delete cascade; ALTER TABLE Now suppose we have a requirement to ensure that tellers are not removed from pgbench_teller if they still hold some balance (if the tbalance is more than 0). Let’s add a trigger on pgbench_teller to reject such deletes: CREATE OR REPLACE FUNCTION public.deny_delete_pgbench_teller() RETURNS trigger LANGUAGE plpgsql AS $function$ begin IF old.tbalance > 0 THEN RETURN NULL; END IF; RETURN OLD; end; $function$ CREATE TRIGGER deny_delete_of_non_zero_teller BEFORE DELETE ON pgbench_tellers FOR EACH ROW EXECUTE PROCEDURE deny_delete_pgbench_teller() ; To test if the trigger is working or not, we can delete a branch which has more than a 0 balance. In my dataset generated by pgbench, row with tid=1 in pgbench_tellers is one such row: pgtraining=> select count(*) from pgbench_tellers where tid=1; count ------- 1 (1 row) pgtraining=> pgtraining=> delete from pgbench_tellers where bid=1; DELETE 0 pgtraining=> select count(*) from pgbench_tellers where tid=1; count ------- 1 (1 row) Now let’s delete a row from the parent table – pgbench_branches: pgtraining=> delete from pgbench_branches where bid=1; After this delete, the database is left in an inconsistent state: pgtraining=> select count(*) from pgbench_branches where bid=1; count ------- 0 (1 row) pgtraining=> select count(*) from pgbench_tellers where bid=1; count ------- 10 (1 row) Not only does this violate the behavior for the foreign key defined by ANSI, it also makes it hard for the database optimizer to perform optimizations like removing redundant joins. For example, the optimization that we discussed in the previous part of this blog series (part 3), with pgbench_v_teller view, can’t be applied. 
Skipping a join with pgbench_branches would now produces inconsistent results: pgtraining=> set apg_enable_remove_redundant_inner_joins =on; SET pgtraining=> explain analyze select tid,bid,tbalance from pgbench_v_teller; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Hash Join (cost=1673.10..12055.73 rows=500002 width=12) (actual time=11.885..147.258 rows=499992 loops=1) Hash Cond: (teller.bid = branches.bid) -> Seq Scan on pgbench_tellers teller (cost=0.00..9070.02 rows=500002 width=12) (actual time=0.004..33.633 rows=500002 loops=1) -> Hash (cost=1013.60..1013.60 rows=52760 width=4) (actual time=11.639..11.639 rows=50001 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 2270kB -> Seq Scan on pgbench_branches branches (cost=0.00..1013.60 rows=52760 width=4) (actual time=0.005..5.507 rows=50001 loops=1) Planning Time: 0.146 ms Execution Time: 165.517 ms (8 rows) If we drop and try to recreate the foreign key constraint, it fails: pgtraining=> alter table pgbench_tellers drop constraint pgbench_tellers_bid_fkey ; ALTER TABLE pgtraining=> alter table pgbench_tellers add constraint pgbench_tellers_bid_fkey FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) on delete cascade; ERROR: insert or update on table "pgbench_tellers" violates foreign key constraint "pgbench_tellers_bid_fkey" DETAIL: Key (bid)=(1) is not present in table "pgbench_branches". Let’s fix the data and add a foreign key: pgtraining=> insert into pgbench_branches values (1,(select sum(tbalance) from pgbench_tellers where bid=1), ' '); INSERT 0 1 pgtraining=> alter table pgbench_tellers add constraint pgbench_tellers_bid_fkey FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) on delete cascade; ALTER TABLE Effect of the Aurora PostgreSQL parameter Now use the Amazon Relational Database Service (Amazon RDS) console to change the parameter ansi_force_foreign_key_checks in the DB cluster parameter group. Let’s run the delete statement again: pgtraining=> show ansi_force_foreign_key_checks ; ansi_force_foreign_key_checks ------------------------------- on (1 row) pgtraining=> delete from pgbench_branches where bid=1; ERROR: Attempt to suppress referential action with before trigger. CONTEXT: SQL statement "DELETE FROM ONLY "public"."pgbench_tellers" WHERE $1 OPERATOR(pg_catalog.=) "bid"" pgtraining=> pgtraining=> pgtraining=> pgtraining=> pgtraining=> select count(*) from pgbench_branches where bid=1; count ------- 1 (1 row) pgtraining=> select count(*) from pgbench_tellers where bid=1; count ------- 10 (1 row) When ansi_force_foreign_key_checks is enabled, Aurora PostgreSQL makes sure that the referential integrity constraint is enforced irrespective of trigger context for user-defined triggers. If the triggers attempt to suppress or skip the cascaded DELETE or cascaded UPDATE, the original action on the parent table is also rolled back. If you’re using an application that’s compatible with PostgreSQL and prefer to stick to the default behavior, you can switch this off in the DB cluster parameter group. Although you can’t change this parameter for a specific transaction or session, a change to this parameter is dynamic and doesn’t require a restart of the DB instance. 
ansi_qualified_update_set_target This parameter controls if Aurora PostgreSQL retains PostgreSQL’s default behavior when parsing the column name in the SET clause of an UPDATE statement, or if it accepts syntax that’s consistent with what’s allowed by Oracle and SQL Server. The following is the description in the pg_settings view: pgtraining=> select name, short_desc, extra_desc from pg_settings where name like 'ansi_qualified_update_set_target'; -[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ name | ansi_qualified_update_set_target short_desc | Support table and schema qualifiers in UPDATE ... SET statements. e.g. UPDATE t SET s.t.c = v WHERE p.off provides community PostgreSQL semantics and on provides this feature. extra_desc | When turned on, the UPDATE ... SET syntax is consistent with what's allowed by Oracle and SQL/Server, and can reduce migration effort. PostgreSQL allows composite types subfields to be set using syntax that is potentially ambiguous with respect to the syntax that Oracle and SQL/Server accept. In cases where the syntax is ambiguous, an ERROR message will be raised to inform the user that the SET target is ambiguous. PostgreSQL behavior The following code is the UPDATE syntax as per PostgreSQL documentation: [ WITH [ RECURSIVE ] with_query [, ...] ] UPDATE [ ONLY ] table_name [ * ] [ [ AS ] alias ] SET { column_name = { expression | DEFAULT } | ( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [, ...] ) | ( column_name [, ...] ) = ( sub-SELECT ) } [, ...] [ FROM from_item [, ...] ] [ WHERE condition | WHERE CURRENT OF cursor_name ] [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ] The column_name can be specified in the table named by table_name. The column name can be qualified with a subfield name or array subscript, if needed. Don’t include the table’s name in the specification of a target column. For example, UPDATE table_name SET table_name.col = 1 is invalid. PostgreSQL throws an exception if UPDATE queries have a fully qualified table name or even a column name with a table name prefix in the SET clause: pgtraining=> update public.pgbench_tellers set pgbench_tellers.tbalance=1000 where tid=1; ERROR: column "pgbench_tellers" of relation "pgbench_tellers" does not exist LINE 1: update public.pgbench_tellers set pgbench_tellers.tbalance=1... ^ pgtraining=> pgtraining=> update public.pgbench_tellers set public.pgbench_tellers.tbalance=1000 where tid=1; ERROR: column "public" of relation "pgbench_tellers" does not exist LINE 1: update public.pgbench_tellers set public.pgbench_tellers.tba... As the exception explains, PostgreSQL is expecting pgbench_terllers or public to be a column and tbalance to be a subfield. This behavior of PostgreSQL can increase the effort required to port application code when migrating from commercial engines like Oracle and SQL Server. 
Effect of the Aurora PostgreSQL parameter The commercial engines allow both syntaxes and we can set Aurora PostgreSQL to behave the same way by setting ansi_qualified_update_set_target to on: pgtraining=> set ansi_qualified_update_set_target=on; SET pgtraining=> pgtraining=> update public.pgbench_tellers set public.pgbench_tellers.tbalance=1000 where tid=1; UPDATE 1 pgtraining=> update public.pgbench_tellers set pgbench_tellers.tbalance=1000 where tid=1; UPDATE 1 This can be useful when migrating an application from Oracle or SQL Server. It allows UPDATE statements in your application with Oracle or SQL Server-compatible syntax to work with Aurora PostgreSQL with little or no change. You can set this parameter in the DB cluster parameter group or at the session level. A change to this parameter doesn’t require a restart of the DB instance. Conclusion Aurora PostgreSQL has several levers by way of parameters that allow for ANSI compliance, which helps a great deal if you’re migrating from other engines to Aurora PostgreSQL. As part of this four part blog series, we covered parameters related to memory and query plan management in part one. In part two, we covered replication, security and logging parameters. Part three and part four, covered detailed explanation of parameters that can be used to modify Aurora PostgreSQL behavior to improve query performance and increase adherence to ANSI standard, which is helpful while migrating applications from other database engines. AWS continues to iterate on customer feedback and improve Aurora, which offers enterprise-grade features on popular open-source database engines. About the authors Sameer Kumar is a Database Specialist Technical Account Manager at Amazon Web Services. He focuses on Amazon RDS, Amazon Aurora and Amazon DocumentDB. He works with enterprise customers providing technical assistance on database operational performance and sharing database best practices.       Gopalakrishnan Subramanian is a Database Specialist solutions architect at Amazon Web Services. He works with our customers to provide guidance and technical assistance on database projects, helping them improving the value of their solutions when using AWS https://aws.amazon.com/blogs/database/amazon-aurora-postgresql-parameters-part-4-ansi-compatibility-options/
0 notes
programmingsolver · 6 years ago
Text
Assignment 2 Solution
Aims
This assignment aims to give you practice in
further use of SQL and PLpgSQL
writing scripts in PHP that interact with a database
The goal is to complete the functionality of some command-line tools via a combination of database code and PHP code.
Summary
Submission: Login to Course Web Site > Assignments > Assignment 2 > Assignment 2 Specification > Make Submission > upload required files >…
View On WordPress
0 notes
dhamaniasad · 6 years ago
Link
dhamaniasad starred mapbox/node-sqlite3 May 18, 2019
mapbox/node-sqlite3
Asynchronous, non-blocking SQLite3 bindings for Node.js
PLpgSQL 3.7k Updated May 18
0 notes
ungdungmoi · 7 years ago
Text
[Advanced WebGIS] – Route finding with PostGIS + pgRouting
Following the previous series on WebGIS using GeoServer, PostGIS and OpenLayers, this series introduces a problem that comes up often in WebGIS applications: finding the shortest path.
In addition to the tools from the previous articles (GeoServer, PostgreSQL, PostGIS, OpenLayers), this article needs one more tool: pgRouting, a PostGIS extension that provides routing with a variety of algorithms. You can learn more about pgRouting here: http://pgrouting.org/
Installation
If you are using the latest PostGIS release at the time of writing (PostGIS 2.2.2 with PostgreSQL 9.5), pgRouting is already bundled with the installer and you don't need to install anything extra. For older versions that don't bundle pgRouting, such as PostgreSQL 9.1 with PostGIS 2.0, look for an installer here: http://postgis.net/windows_downloads/ — for example, I found the link for PostgreSQL 9.1 here: http://winnie.postgis.net/download/windows/pg91/buildbot/. Then download pgrouting-pg91-binaries-2.0.0devw64.zip.
After downloading you will have a zip file. Extract it to get the bin, lib and share directories plus some other files; copy all of those directories and files into your PostgreSQL installation directory, which in my case is C:\Program Files\PostgreSQL\9.1.
Next, create the extension for pgRouting:
Open pgAdmin, choose SQL and run the following command:
CREATE EXTENSION pgrouting;
If your PostGIS installation already came with pgRouting bundled, you can skip this step.
To check whether the installation succeeded and which pgRouting version you have, run:
SELECT pgr_version();
That completes the installation.
Preparing the data
For the routing problem we need a polyline layer to serve as the road network.
You can create this data yourself or download OpenStreetMap data here: http://download.geofabrik.de/asia.html . After downloading you will have a folder with several shapefiles; we will use roads.shp as the routing data.
The OpenStreetMap data covers the road network of all of Vietnam (and many other places), so it is fairly heavy to load. If you only need demo data, clip out a smaller area, as in the figure below. The data's spatial reference is EPSG:4326 (WGS 84).
For clipping we can use QGIS's Clip tool. First prepare a polygon layer outlining the area to clip (go to Layer > Create Layer and choose the layer type you want to create). Then open Processing > Toolbox, and in the toolbox search box type "clip" to open the Clip function.
Choose the layer to clip, the polygon layer to clip against, and the output shapefile. Click Run to complete the clip.
Note: the OpenStreetMap data cannot be fed straight into route creation, because pgRouting requires every segment between intersections to be its own line. We therefore have to explode the polylines into individual lines, using QGIS's Explode tool as follows:
Open Processing > Toolbox. In the toolbox search box, type "Explode" to open the Explode function.
Choose the layer to explode, choose the output file path, and click Run.
3. A new shapefile is created containing the exploded data.
Next we load this layer into our PostGIS database. You can do this with QGIS, following this guide: https://ungdungmoi.edu.vn/nang-cao-hieu-qua-voi-qgis.html. We will name the table roads.
Back in pgAdmin: for our roads layer to be routable, we have to create a topology for it using the following steps:
Open pgAdmin, select the database containing the roads table, and choose SQL.
2. Add two columns to the roads table created in the previous step:
alter table public.roads add column source integer;
alter table public.roads add column target integer;
3. Create the topology for roads:
select pgr_createTopology('public.roads', 0.0001, 'geom', 'id');
The parameters are: table name, snapping tolerance, name of the geometry column, and name of the id column. Here I chose a tolerance of 0.0001; you can use a different value.
4. Click Run to execute the statement above. If it succeeds, you will see a message like the one below:
5. When it finishes, pgRouting creates one more table for us, named roads_vertices_pgr.
That completes the data preparation. To test the routing we will use QGIS as follows:
Open QGIS and go to Database > DB Manager > DB Manager.
Make sure you are connected to PostGIS, select the database you connected to, go to Database > SQL window, enter the following query and press Execute (F5):
SELECT seq, id1 AS node, id2 AS edge, cost, geom
FROM pgr_dijkstra(
'SELECT id, source, target, st_length(geom) as cost FROM public.roads',
1, 3000, false, false
) as di
JOIN public.roads pt
ON di.id2 = pt.id ;
3. Choose Load as new layer to view the result,
which is displayed as in the figure below.
So we have created a route successfully.
Creating the Route layer in GeoServer
Above we saw how to find a route with the pgr_dijkstra function, whose call looks like pgr_dijkstra(sql, source, target, directed). It searches using the source and target columns of the roads table, which hold the ids of the nodes created in roads_vertices_pgr. To make searching more intuitive, we need a function that can search based on the coordinates of a start point and an end point; we'll call it pgr_fromAtoB, shown below:
CREATE OR REPLACE FUNCTION pgr_fromAtoB(
IN tbl varchar,
IN x1 double precision,
IN y1 double precision,
IN x2 double precision,
IN y2 double precision,
OUT seq integer,
OUT gid integer,
OUT name text,
OUT heading double precision,
OUT cost double precision,
OUT geom geometry
)
RETURNS SETOF record AS
$BODY$
DECLARE
sql text;
rec record;
source integer;
target integer;
point integer;
BEGIN
-- Find nearest node
EXECUTE 'SELECT id::integer FROM roads_vertices_pgr
ORDER BY the_geom <-> ST_GeometryFromText(''POINT('
|| x1 || ' ' || y1 || ')'',4326) LIMIT 1' INTO rec;
source := rec.id;
EXECUTE 'SELECT id::integer FROM roads_vertices_pgr
ORDER BY the_geom <-> ST_GeometryFromText(''POINT('
|| x2 || ' ' || y2 || ')'',4326) LIMIT 1' INTO rec;
target := rec.id;
-- Shortest path query (TODO: limit extent by BBOX)
seq := 0;
sql := 'SELECT id, geom, name, cost, source, target,
ST_Reverse(geom) AS flip_geom FROM ' ||
'pgr_dijkstra(''SELECT id , source::int, target::int, '
|| 'st_length(geom) as cost FROM '
|| quote_ident(tbl) || ''', '
|| source || ', ' || target
|| ' , false, false), '
|| quote_ident(tbl) || ' WHERE id2 = id ORDER BY seq';
-- Remember start point
point := source;
FOR rec IN EXECUTE sql
LOOP
-- Flip geometry (if required)
IF ( point != rec.source ) THEN
rec.geom := rec.flip_geom;
point := rec.source;
ELSE
point := rec.target;
END IF;
-- Calculate heading (simplified)
EXECUTE 'SELECT degrees( ST_Azimuth(
ST_StartPoint(''' || rec.geom::text || '''),
ST_EndPoint(''' || rec.geom::text || ''') ) )'
INTO heading;
-- Return record
seq := seq + 1;
gid := rec.id;
name := rec.name;
cost := rec.cost;
geom := rec.geom;
RETURN NEXT;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE STRICT;
The function above searches roads_vertices_pgr for the nodes nearest to the points we pass in, uses them as source and target, and then feeds those two nodes into pgr_dijkstra to find the route.
I adapted this function from the pgRouting workshop at http://workshop.pgrouting.org/chapters/wrapper.html, adjusting it to match the table and column names here, replacing the length column used in the documentation with st_length(geom) as cost, and so on. You will likewise need to change the roads_vertices_pgr table name and the id and geom column names to match your own database.
Copy the function above into the database's SQL window, adjust it to fit your database, and click Run; the function will be loaded into the database.
Next, open the GeoServer administration page. Create a workspace and a store for the layer following the earlier "Public Data với GeoServer" guide.
At the layer-creation step we use GeoServer's Configure new SQL view… tool. Choose Configure new SQL view…
Enter route as the View Name, and enter the following SQL statement:
SELECT (route.geom) FROM (
SELECT geom FROM pgr_fromAtoB('roads', %x1%, %y1%, %x2%, %y2%
) ORDER BY seq) AS route
Next choose Guess parameters from SQL, set Default value = 0 and Validation regular expression = ^-?[\d.]+$
In the Attributes section click Refresh; for Type choose LineString, and for SRID choose the data's coordinate system, here 4326.
Click Save to complete this step. Next, fill in the remaining information for the layer. You don't need to enter anything extra; just click Compute from data and Compute from native bounds to generate the layer's bounding boxes.
Click Save to finish creating the layer.
With that, the route layer is created in GeoServer. In the next part we will look at how to use this layer to find routes with OpenLayers.
Reference: http://workshop.pgrouting.org/chapters/geoserver.html
Note: the ST_MakeLine call from the pgRouting documentation did not work for me, so I have dropped it for now; we don't have to merge everything into a single line — returning multiple geometries still displays the route.
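If you do want a single geometry per route, one possible alternative (my own sketch, not from the original article) is to union and merge the segments directly in the SQL view instead of using ST_MakeLine:

SELECT ST_LineMerge(ST_Union(route.geom)) AS geom FROM (
SELECT geom FROM pgr_fromAtoB('roads', %x1%, %y1%, %x2%, %y2%)
) AS route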
Displaying the route in the WebGIS with OpenLayers
First we use a roads layer to display the road network as the base map; see the earlier "Ứng dụng WebGIS với Openlayer" article for how to build a WebGIS using OpenLayers and GeoServer.
My layer is set up as follows:
var format = 'image/png';
var bounds = [105.671539306641, 20.8914451599121,
105.982925415039, 21.186128616333];
roads = new ol.layer.Image({
source: new ol.source.ImageWMS({
url: 'http://localhost:8080/geoserver/dc/wms',
params: {
'FORMAT': format,
'VERSION': '1.1.1',
STYLES: '',
LAYERS: 'dc:roads',
}
})
});
var projection = new ol.proj.Projection({
code: 'EPSG:4326',
units: 'degrees',
axisOrientation: 'neu'
});
var view = new ol.View({
projection: projection
});
var map = new ol.Map({
target: 'map',
layers: [
roads
],
view: view
});
//map.getView().fitExtent(bounds, map.getSize());
map.getView().fit(bounds, map.getSize());
We still need a div element to display the map:
<div id="map" class="map"></div>
We also need two text boxes to display the coordinates of the points picked on the map, plus buttons to find the route and clear it:
<input type="text" id="txtPoint1" />
<br />
<input type="text" id="txtPoint2" />
<br />
<button id="btnSolve">Tìm đường</button>
<button id="btnReset">Xóa đường</button>
We then add a click handler on the map to capture the coordinates of the points chosen on the map, store them in the startPoint and destPoint features, and add those two points to the map via a vector layer:
var startPoint = new ol.Feature();
var destPoint = new ol.Feature();
var vectorLayer = new ol.layer.Vector({
source: new ol.source.Vector({
features: [startPoint, destPoint]
})
});
map.on('singleclick', function (evt) {
if (startPoint.getGeometry() == null) {
// First click.
startPoint.setGeometry(new ol.geom.Point(evt.coordinate));
$("#txtPoint1").val(evt.coordinate);
} else if (destPoint.getGeometry() == null) {
// Second click.
destPoint.setGeometry(new ol.geom.Point(evt.coordinate));
$("#txtPoint2").val(evt.coordinate);
}
});
In the click handler for the route-finding button, we request the route layer created earlier in GeoServer, passing the start and end points to the layer as parameters:
var result;
$("#btnSolve").click(function () {
var startCoord = startPoint.getGeometry().getCoordinates();
var destCoord = destPoint.getGeometry().getCoordinates();
var params = {
LAYERS: 'dc:route',
FORMAT: 'image/png'
};
var viewparams = [
'x1:' + startCoord[0], 'y1:' + startCoord[1],
'x2:' + destCoord[0], 'y2:' + destCoord[1]
];
params.viewparams = viewparams.join(';');
result = new ol.layer.Image({
source: new ol.source.ImageWMS({
url: 'http://localhost:8080/geoserver/dc/wms',
params: params
})
});
map.addLayer(result);
});
Finally, the handler that clears the route:
$("#btnReset").click(function () {
startPoint.setGeometry(null);
destPoint.setGeometry(null);
// Remove the result layer.
map.removeLayer(result);
});
Our result will look like this:
The complete code for this article:
<html xmlns="http://www.w3.org/1999/xhtml">
<head >
<title>Openlayers test</title>
<link rel="stylesheet" href="http://openlayers.org/en/v3.15.1/css/ol.css" type="text/css" />
<script src="http://openlayers.org/en/v3.15.1/build/ol.js" type="text/javascript"></script>
<script src="https://code.jquery.com/jquery-1.12.3.min.js" type="text/javascript"></script>
<style>
.map, .righ-panel {
height: 500px;
width: 40%;
float: left;
}
.map {
border: 1px solid #000;
}
</style>
<script type="text/javascript">
var roads, result;
$("#document").ready(function () {
var startPoint = new ol.Feature();
var destPoint = new ol.Feature();
var format = 'image/png';
var bounds = [105.671539306641, 20.8914451599121,
105.982925415039, 21.186128616333];
roads = new ol.layer.Image({
source: new ol.source.ImageWMS({
url: 'http://localhost:8080/geoserver/dc/wms',
params: {
'FORMAT': format,
'VERSION': '1.1.1',
STYLES: '',
LAYERS: 'dc:roads',
}
})
});
var projection = new ol.proj.Projection({
code: 'EPSG:4326',
units: 'degrees',
axisOrientation: 'neu'
});
var view = new ol.View({
projection: projection
});
var map = new ol.Map({
target: 'map',
layers: [
roads
],
view: view
});
//map.getView().fitExtent(bounds, map.getSize());
map.getView().fit(bounds, map.getSize());
var vectorLayer = new ol.layer.Vector({
source: new ol.source.Vector({
features: [startPoint, destPoint]
})
});
map.addLayer(vectorLayer);
map.on('singleclick', function (evt) {
if (startPoint.getGeometry() == null) {
// First click.
startPoint.setGeometry(new ol.geom.Point(evt.coordinate));
$("#txtPoint1").val(evt.coordinate);
} else if (destPoint.getGeometry() == null) {
// Second click.
destPoint.setGeometry(new ol.geom.Point(evt.coordinate));
$("#txtPoint2").val(evt.coordinate);
}
});
$("#btnSolve").click(function () {
var startCoord = startPoint.getGeometry().getCoordinates();
var destCoord = destPoint.getGeometry().getCoordinates();
var params = {
LAYERS: 'dc:route',
FORMAT: 'image/png'
};
var viewparams = [
'x1:' + startCoord[0], 'y1:' + startCoord[1],
'x2:' + destCoord[0], 'y2:' + destCoord[1]
];
params.viewparams = viewparams.join(';');
result = new ol.layer.Image({
source: new ol.source.ImageWMS({
url: 'http://localhost:8080/geoserver/dc/wms',
params: params
})
});
map.addLayer(result);
});
$("#btnReset").click(function () {
startPoint.setGeometry(null);
destPoint.setGeometry(null);
// Remove the result layer.
map.removeLayer(result);
});
});
</script>
</head>
<body>
<div id="map" class="map"></div>
<div class="righ-panel">
<input type="text" id="txtPoint1" />
<br />
<input type="text" id="txtPoint2" />
<br />
<button id="btnSolve">Tìm đường</button>
<button id="btnReset">Xóa đường</button>
</div>
</body>
</html>
Note: you must set a style for the route layer so it can be distinguished from the roads layer. For how to style a layer in GeoServer, see: Nâng cao hiệu quả với QGIS
Reference: http://workshop.pgrouting.org/chapters/ol3_client.html
Author: Đỗ Xuân Cường
Source: cuongdx313.wordpress.com
Original article: [Học WebGIS nâng cao] – Bài toán tìm đường với PostGIS+ pgRouting
0 notes
info-comp · 1 year ago
Text
This article covers several ways to create table functions in PostgreSQL, i.e. functions that return tabular data, with examples written both in SQL and in PL/pgSQL.
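As a quick illustration (a sketch using an invented orders table, not taken from the article), a table function can be written either in SQL or in PL/pgSQL:

-- SQL version
CREATE OR REPLACE FUNCTION orders_by_customer_sql(p_customer_id integer)
RETURNS TABLE (order_id integer, order_total numeric)
LANGUAGE sql AS $$
    SELECT o.order_id, o.order_total FROM orders o WHERE o.customer_id = p_customer_id;
$$;

-- PL/pgSQL version using RETURN QUERY
CREATE OR REPLACE FUNCTION orders_by_customer_plpgsql(p_customer_id integer)
RETURNS TABLE (order_id integer, order_total numeric)
LANGUAGE plpgsql AS $$
BEGIN
    RETURN QUERY
        SELECT o.order_id, o.order_total FROM orders o WHERE o.customer_id = p_customer_id;
END;
$$;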
0 notes
banker-hacker · 10 years ago
Text
#plpgSQL function can't just run a query; you have to put the results somewhere.
http://stackoverflow.com/a/16688055/1964078
but you also can't RETURN QUERY from an anonymous code block, so the answer is to create a temp table:
http://www.ienablemuch.com/2010/12/return-results-from-anonymous-code.html
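A minimal sketch of that pattern (the query here is just an arbitrary catalog query for illustration):

DO $$
BEGIN
    DROP TABLE IF EXISTS tmp_result;
    -- a DO block has no RETURNS clause, so stash the rows in a temp table
    CREATE TEMP TABLE tmp_result AS
        SELECT relname, reltuples FROM pg_class WHERE relkind = 'r';
END
$$;

-- then read the results back in the same session
SELECT * FROM tmp_result;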
0 notes
vsumner-blog-blog · 12 years ago
Text
Catch unique exception in plpgsql
BEGIN
    INSERT INTO db(a, b) VALUES (key, data);
    RETURN;
EXCEPTION WHEN unique_violation THEN
    -- do nothing
END;
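For context, a hedged sketch of how that block typically sits inside a complete function (the db table and the key/data parameters are placeholders carried over from the snippet above):

CREATE OR REPLACE FUNCTION insert_ignore_duplicate(key integer, data text)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO db(a, b) VALUES (key, data);
    RETURN;
EXCEPTION WHEN unique_violation THEN
    -- duplicate key: silently ignore
END;
$$;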
1 note · View note
globalmediacampaign · 4 years ago
Text
Migrating user-defined types from Oracle to PostgreSQL
Migrating from commercial databases to open source is a multistage process with different technologies, starting from assessment, data migration, data validation, and cutover. One of the key aspects for any heterogenous database migration is data type conversion. In this post, we show you a step-by-step approach to migrate user-defined types (UDT) from Oracle to Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL. We also provide an overview of custom operators to use in SQL queries to access tables with UDT in PostgreSQL. Migrating UDT from Oracle to Aurora PostgreSQL or Amazon RDS for PostgreSQL isn’t always straightforward, especially with UDT member functions. UDT defined in Oracle and PostgreSQL store structured business data in its natural form and work efficiently with applications using object-oriented programming techniques. UDT in Oracle can have both the data structure and the methods that operate on that data within the relational model. Though similar, the approaches to implement UDT in Oracle and PostgreSQL with member functions have subtle differences. Overview At a high level, migrating tables with UDT from Oracle to PostgreSQL involves following steps: Converting UDT – You can use the AWS Schema Conversion Tool (AWS SCT) to convert your existing database schema from one database engine to another. Unlike PostgreSQL, user-defined types in Oracle allow PL/SQL-based member functions to be a part of UDT. Because PostgreSQL doesn’t support member functions in UDT, you need to handle them separately during UDT conversion. Migrating data from tables with UDT – AWS Database Migration Service (AWS DMS) helps you migrate data from Oracle databases to Aurora PostgreSQL and Amazon RDS for PostgreSQL. However, as of this writing, AWS DMS doesn’t support UDT. This post explains using the open-source tool Ora2pg to migrate tables with UDT from Oracle to PostgreSQL. Prerequisites Before getting started, you must have the following prerequisites: The AWS SCT installed on a local desktop or an Amazon Elastic Compute Cloud (Amazon EC2) instance. For instructions, see Installing, verifying, and updating the AWS SCT. Ora2pg installed and set up on an EC2 instance. For instructions, see the Ora2pg installation guide. Ora2pg is an open-source tool distributed via GPLv3 license. EC2 instances used for Ora2pg and the AWS SCT should have connectivity to the Oracle source and PostgreSQL target databases.  Dataset This post uses a sample dataset of a sporting event ticket management system. For this use case, the table DIM_SPORT_LOCATION_SEATS with event location seating details has been modified to include location_t as a UDT. location_t has information of sporting event locations and seating capacity. Oracle UDT location_t The UDT location_t has attributes describing sporting event location details, including an argument-based member function to compare current seating capacity of the location with expected occupancy for a sporting event. The function takes expected occupancy for the event as an argument and compares it to current seating capacity of the event location. It returns t if the sporting event location has enough seating capacity for the event, and f otherwise. 
See the following code: create or replace type location_t as object ( LOCATION_NAME VARCHAR2 (60 ) , LOCATION_CITY VARCHAR2 (60 ), LOCATION_SEATING_CAPACITY NUMBER (7) , LOCATION_LEVELS NUMBER (1) , LOCATION_SECTIONS NUMBER (4) , MEMBER FUNCTION COMPARE_SEATING_CAPACITY(capacity in number) RETURN VARCHAR2 ); / create or replace type body location_t is MEMBER FUNCTION COMPARE_SEATING_CAPACITY(capacity in number) RETURN VARCHAR2 is seat_capacity_1 number ; seat_capacity_2 number ; begin if ( LOCATION_SEATING_CAPACITY is null ) then seat_capacity_1 := 0; else seat_capacity_1 := LOCATION_SEATING_CAPACITY; end if; if ( capacity is null ) then seat_capacity_2 := 0; else seat_capacity_2 := capacity; end if; if seat_capacity_1 >= seat_capacity_2 then return 't'; else return 'f'; end if; end COMPARE_SEATING_CAPACITY; end; / Oracle table DIM_SPORT_LOCATION_SEATS The following code shows the DDL for DIM_SPORT_LOCATION_SEATS table with UDT location_t in Oracle: CREATE TABLE DIM_SPORT_LOCATION_SEATS ( SPORT_LOCATION_SEAT_ID NUMBER NOT NULL , SPORT_LOCATION_ID NUMBER (3) NOT NULL , LOCATION location_t, SEAT_LEVEL NUMBER (1) NOT NULL , SEAT_SECTION VARCHAR2 (15) NOT NULL , SEAT_ROW VARCHAR2 (10 BYTE) NOT NULL , SEAT_NO VARCHAR2 (10 BYTE) NOT NULL , SEAT_TYPE VARCHAR2 (15 BYTE) , SEAT_TYPE_DESCRIPTION VARCHAR2 (120 BYTE) , RELATIVE_QUANTITY NUMBER (2) ) ; Converting UDT Let’s start with the DDL conversion of location_t and the table DIM_SPORT_LOCATION_SEATS from Oracle to PostgreSQL. You can use the AWS SCT to convert your existing database schema from Oracle to PostgreSQL. Because the target PostgreSQL database doesn’t support member functions in UDT, the AWS SCT ignores the member function during UDT conversion from Oracle to PostgreSQL. In PostgreSQL, we can create functions in PL/pgSQL with operators to have similar functionality as Oracle UDT does with member functions. For this sample dataset, we can convert location_t, to PostgreSQL using the AWS SCT. The AWS SCT doesn’t convert the DDL of member functions for location_t from Oracle to PostgreSQL. The following screenshot shows our SQL code. PostgreSQL UDT location_t The AWS SCT converts LOCATION_LEVELS and LOCATION_SECTIONS from the location_t UDT to SMALLINT for Postgres optimizations based on schema mapping rules. See the following code: create TYPE location_t as ( LOCATION_NAME CHARACTER VARYING(60) , LOCATION_CITY CHARACTER VARYING(60) , LOCATION_SEATING_CAPACITY INTEGER , LOCATION_LEVELS SMALLINT , LOCATION_SECTIONS SMALLINT ); For more information about schema mappings, see Creating mapping rules in the AWS SCT. Because PostgreSQL doesn’t support member functions in UDT, the AWS SCT ignores them while converting the DDL from Oracle to PostgreSQL. You need to write a PL/pgSQL function separately. In order to write a separate entity, you may need to add additional UDT object parameters to the member function. For our use case, the member function compare_seating_capacity is rewritten as a separate PL/pgSQL function. The return data type for this function is bool instead of varchar2 (in Oracle), because PostgreSQL provides a bool data type for true or false. 
See the following code: CREATE or REPLACE FUNCTION COMPARE_SEATING_CAPACITY (event_loc_1 location_t,event_loc_2 integer) RETURNS bool AS $$ declare seat_capacity_1 integer; seat_capacity_2 integer ; begin if ( event_loc_1.LOCATION_SEATING_CAPACITY is null ) then seat_capacity_1 = 0 ; else seat_capacity_1 = event_loc_1.LOCATION_SEATING_CAPACITY; end if; if ( event_loc_2 is null ) then seat_capacity_2 = 0 ; else seat_capacity_2 = event_loc_2 ; end if; if seat_capacity_1 >= seat_capacity_2 then return true; else return false; end if; end; $$ LANGUAGE plpgsql; The UDT conversion is complete yielding the PL/pgSQL function and the UDT in PostgreSQL. You can now create the DDL for tables using this UDT in the PostgreSQL target database using the AWS SCT, as shown in the following screenshot. In the next section, we dive into migrating data from tables containing UDT from Oracle to PostgreSQL. Migrating data from tables with UDT In this section, we use the open-source tool Ora2pg to perform a full load of the DIM_SPORT_LOCATION_SEATS table with UDT from Oracle to PostgreSQL. To install and set up Ora2pg on an EC2 instance, see the Ora2pg installation guide. After installing Ora2pg, you can test connectivity with the Oracle source and PostgreSQL target databases. To test the Oracle connection, see the following code: -bash-4.2$ cd $ORACLE_HOME/network/admin -bash-4.2$ echo "oratest=(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = oratest.xxxxxxx.us-west-2.rds.amazonaws.com )(PORT =1526))(CONNECT_DATA =(SERVER = DEDICATED) (SERVICE_NAME = UDTTEST)))" >> tnsnames.ora -bash-4.2$ sqlplus username/password@oratest SQL*Plus: Release 11.2.0.4.0 Production on Fri Aug 7 05:05:35 2020 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production SQL> To test the Aurora PG connection, see the following code: -bash-4.2$ psql -h pgtest.xxxxxxxx.us-west-2.rds.amazonaws.com -p 5436 -d postgres master Password for user master: psql (9.2.24, server 11.6) WARNING: psql version 9.2, server version 11.0. Some psql features might not work. SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256) Type "help" for help. postgres=> You use a configuration file to migrate data from Oracle to PostgreSQL with Ora2pg. The following is the configuration file used for this sample dataset. Ora2pg has many options to copy and export different object types. In this example, we use COPY to migrate tables with UDT: -bash-4.2$ cat ora2pg_for_copy.conf ORACLE_HOME /usr/lib/oracle/11.2/client64 ORACLE_DSN dbi:Oracle:sid=oratest ORACLE_USER master ORACLE_PWD xxxxxxx DEBUG 1 EXPORT_SCHEMA 1 SCHEMA dms_sample CREATE_SCHEMA 0 COMPILE_SCHEMA 0 PG_SCHEMA TYPE COPY PG_DSN dbi:Pg:dbname=postgres;host=pgtest.xxxxxxxxx.us-west-2.rds.amazonaws.com;port=5436 PG_USER master PG_PWD xxxxxxxx ALLOW DIM_SPORT_LOCATION_SEATS BZIP2 DATA_LIMIT 400 BLOB_LIMIT 100 LONGREADLEN6285312 LOG_ON_ERROR PARALLEL_TABLES 1 DROP_INDEXES 1 WITH_OID 1 FILE_PER_TABLE The configuration file has the following notable settings: SCHEMA – Sets the list of schemas to be exported as part of data migration. ALLOW – Provides a list of objects to migrate. Object names could be space- or comma-separated. You can also use regex like DIM_* to include all objects starting with DIM_ in the dms_sample schema. DROP_INDEXES – Improves data migration performance by dropping indexes before data load and recreating them in the target database post-data migration. 
TYPE – Provides an export type for data migration. For our use case, we’re migrating data to the target table using COPY statements. This parameter can only have a single value. For more information about the available options in Ora2pg to migrate data from Oracle to PostgreSQL, see the Ora2pg documentation. In the following code, we migrate the DIM_SPORT_LOCATION_SEATS table from Oracle to PostgreSQL using the configuration file created previously: -bash-4.2$ ora2pg -c ora2pg_for_copy.conf -d Ora2Pg version: 18.1 Trying to connect to database: dbi:Oracle:sid=oratest Isolation level: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE Retrieving table information... [1] Scanning table DIM_SPORT_LOCATION_SEATS (2 rows)... Trying to connect to database: dbi:Oracle:sid=oratest Isolation level: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE Retrieving partitions information... Dropping indexes of table DIM_SPORT_LOCATION_SEATS... Looking how to retrieve data from DIM_SPORT_LOCATION_SEATS... Data type LOCATION_T is not native, searching on custom types. Found Type: LOCATION_T Looking inside custom type LOCATION_T to extract values... Fetching all data from DIM_SPORT_LOCATION_SEATS tuples... Dumping data from table DIM_SPORT_LOCATION_SEATS into PostgreSQL... Setting client_encoding to UTF8... Disabling synchronous commit when writing to PostgreSQL... DEBUG: Formatting bulk of 400 data for PostgreSQL. DEBUG: Creating output for 400 tuples DEBUG: Sending COPY bulk output directly to PostgreSQL backend Extracted records from table DIM_SPORT_LOCATION_SEATS: total_records = 2 (avg: 2 recs/sec) [========================>] 2/2 total rows (100.0%) - (1 sec., avg: 2 recs/sec). Restoring indexes of table DIM_SPORT_LOCATION_SEATS... Restarting sequences The data from the DIM_SPORT_LOCATION_SEATS table with UDT is now migrated to PostgreSQL. Setting search_path in PostgreSQL allows dms_sample to be the schema searched for objects referenced in SQL statements in this database session, without qualifying them with the schema name. See the following code: postgres=> set search_path=dms_sample; SET postgres=> select sport_location_seat_id,location,seat_level,seat_section,seat_row,seat_no from DIM_SPORT_LOCATION_SEATS; sport_location_seat_id | location | seat_level | seat_section | seat_row | seat_no ------------------------+----------------------------+------------+--------------+----------+--------- 1 | (Germany,Munich,75024,2,3) | 3 | S | 2 | S-8 1 | (Germany,Berlin,74475,2,3) | 3 | S | 2 | S-8 (2 rows) Querying UDT in PostgreSQL Now that both the DDL and data for the table DIM_SPORT_LOCATION_SEATS are migrated to PostgreSQL, we can query the UDT using the newly created PL/pgSQL functions. Querying Oracle with the UDT member function The following code is an example of a SQL query to determine if any stadiums in Germany have a seating capacity of more than 75,000 people. The dataset provides seating capacity information of stadiums in Berlin and Munich: SQL> select t.location.LOCATION_CITY CITY,t.LOCATION.COMPARE_SEATING_CAPACITY(75000) SEATS_AVAILABLE from DIM_SPORT_LOCATION_SEATS t where t.location.LOCATION_NAME='Germany'; CITY SEATS_AVAILABLE --------------------------------- ---------------- Munich t Berlin f The result of this SQL query shows that a stadium in Munich has sufficient seating capacity. However, the event location in Berlin doesn’t have enough seating capacity to host a sporting event of 75,000 people. 
Querying PG with the PL/pgSQL function The following code is the rewritten query in PostgreSQL, which uses the PL/pgSQL function COMPARE_SEATING_CAPACITY to show the same results: postgres=> select (location).LOCATION_CITY,COMPARE_SEATING_CAPACITY(location,75000) from DIM_SPORT_LOCATION_SEATS where (location).LOCATION_NAME='Germany'; location_city | compare_seating_capacity ---------------+-------------------------- Munich | t Berlin | f (2 rows) Using operators You can also use PostgreSQL operators to simplify the previous query. Every operator is a call to an underlying function. PostgreSQL provides a large number of built-in operators for system types. For example, the built-in integer = operator has the underlying function as int4eq(int,int) for two integers. You can invoke built-in operators using the operator name or its underlying function. The following queries get sport location IDs with only two levels using the = operator and its built-in function int4eq: postgres=> select sport_location_id,(location).location_levels from DIM_SPORT_LOCATION_SEATS where (location).location_levels = 2; sport_location_id | location_levels -------------------+----------------- 2 | 2 3 | 2 (2 rows) postgres=> select sport_location_id,(location).location_levels from DIM_SPORT_LOCATION_SEATS where int4eq((location).location_levels,2); sport_location_id | location_levels -------------------+----------------- 2 | 2 3 | 2 (2 rows) You can use operators to simplify the SQL query that finds stadiums in Germany with a seating capacity of more than 75,000 people. As shown in the following code, the operator >= takes the UDT location_t as the left argument and integer as the right argument to call the compare_seating_capacity function. The COMMUTATOR clause, if provided, names an operator that is the commutator of the operator being defined. Operator X is the commutator of operator Y if (a X b) equals (b Y a) for all possible input values of a and b. In this case, <= acts as commutator to the operator >=. It’s critical to provide commutator information for operators that are used in indexes and join clauses because this allows the query optimizer to flip such a clause for different plan types. CREATE OPERATOR >= ( LEFTARG = location_t, RIGHTARG = integer, PROCEDURE = COMPARE_SEATING_CAPACITY, COMMUTATOR = <= ); The following PostgreSQL query with an operator shows the same results as the Oracle query with the UDT member function: postgres=> select (location).LOCATION_CITY CITY,(location).LOCATION_SEATING_CAPACITY >=75000 from DIM_SPORT_LOCATION_SEATS where (location).LOCATION_NAME='Germany'; city | ?column? --------+---------- Munich | t Berlin | f (2 rows) You can also use the operator >= in the where clause with UDT location_t, just like any other comparison operator. With the help of the user-defined operator >= defined earlier, the SQL query takes the location_t data type as the left argument and integer as the right argument. The following SQL query returns cities in Germany where seating capacity is more than 75,000. postgres=> select (location).LOCATION_CITY from DIM_SPORT_LOCATION_SEATS where (location).LOCATION_NAME='Germany' and location >=75000; location_city --------------- Munich (1 row) Conclusion This post showed you a solution to convert and migrate UDT with member functions from Oracle to PostgreSQL and how to use operators in queries with UDT in PostgreSQL. We hope that you find this post helpful. 
For more information about moving your Oracle workload to Amazon RDS for PostgreSQL or Aurora PostgreSQL, see Oracle Database 11g/12c To Amazon Aurora with PostgreSQL Compatibility (9.6.x) Migration Playbook. As always, AWS welcomes feedback. If you have any comments or questions on this post, please share them in the comments. About the Authors Manuj Malik is a Senior Data Lab Solutions Architect at Amazon Web Services. Manuj helps customers architect and build databases and data analytics solutions to accelerate their path to production as part of AWS Data Lab. He has an expertise in database migration projects and works with customers to provide guidance and technical assistance on database services, helping them improve the value of their solutions when using AWS.     Devika Singh is a Solutions Architect at Amazon Web Services. Devika has expertise in database migrations to AWS and as part of AWS Data Lab, works with customers to design and build solutions in databases, data and analytics platforms. https://aws.amazon.com/blogs/database/migrating-user-defined-types-from-oracle-to-postgresql/
0 notes
globalmediacampaign · 4 years ago
Text
Federated query support for Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL
PostgreSQL is one of the most widely used database engines and is supported by a very large and active community. It’s a viable open-source option to use compared to many commercial databases, which require heavy license costs. Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL are AWS managed offerings that take away the heavy lifting required for setting up the platform, configuring high availability, monitoring, and much more. This allows DBAs to spend more time on business-critical problems like doing schema design early on or query tuning. In 2003, a new specification called SQL/MED (“SQL Management of External Data”) was added to the SQL standard. It’s a standardized way of handling access to remote objects from SQL databases. In 2011, PostgreSQL 9.1 was released with read-only support of this standard, and in 2013, write support was added with PostgreSQL 9.3. The implementation of this concept in PostgreSQL was called foreign data wrappers (FDW). Several FDWs are available that help connect PostgreSQL Server to different remote data stores, ranging from other SQL database engines to flat files. However, most FDWs are independent open-source projects implemented as Postgres extensions, and not officially supported by the PostgreSQL Global Development Group. In this post, we discuss one FDW implementation that comes with PostgreSQL source as a contrib extension module called postgres_fdw. postgres_fdw allows you to implement federated query capability to interact with any remote PostgreSQL-based database, both managed and self-managed on Amazon Elastic Compute Cloud (Amazon EC2) or on premises. This is available in all present versions supported for Amazon RDS for PostgreSQL and Aurora PostgreSQL. The following diagram illustrates this architecture. Use cases  In this post, we primarily focus on two use cases to give an overview on the capability. However, you can easily extended the solution for other federated query use cases. When working with independent software vendors (ISVs), we occasionally see them offering Amazon RDS for PostgreSQL and Aurora PostgreSQL in a multi-tenant setup in which they use one database per customer and a shared database within a single instance. Federated query capability implemented via FDW allows you to pull data from a shared database to other databases as needed. An organization could have multiple systems catering to different departments. For example, a payroll database has employee salary information. This data maybe required by the HR and tax systems to calculate hike or decide tax incurred, respectively. One solution for such a problem is to copy the salary information in both the HR and Tax systems. However, this kind of duplication may lead to problems, like ensuring data accuracy, extra storage space incurred, or double writes. FDWs avoid duplication while providing access to required data that resides in a different foreign database. The following diagram illustrates this architecture. Furthermore, in today’s world of purpose-built databases you host hot or active data in Amazon RDS for PostgreSQL and Aurora PostgreSQL, you have separate data warehouse solutions like Amazon Redshift for data archival. Without federated query support from an active database, you have to stream the active data to the data warehouse on an almost near-real-time basis to run analytical queries, which requires extra efforts and costs to set up a data pipeline and an additional overhead to the data warehouse. 
With federated query capability, you can derive insights while joining data at the transactional database from within itself as well as a data warehouse like Amazon Redshift. The following diagram illustrates this architecture. For more information about querying data in your Aurora PostgreSQL or Amazon RDS for PostgreSQL remote server from Amazon Redshift as the primary database, see Querying data with federated queries in Amazon Redshift.
Prerequisites
Before getting started, make sure you have the following prerequisites:
A primary Aurora PostgreSQL or Amazon RDS for PostgreSQL instance as your source machine.
A remote PostgreSQL-based instance with information like username, password, and database name. You can use any of the following databases: Aurora PostgreSQL, Amazon RDS for PostgreSQL, Amazon Redshift, self-managed PostgreSQL on Amazon EC2, or an on-premises PostgreSQL database.
Network connectivity between the primary and remote database. The remote database can be a different database within the same primary database instance, a separate database instance, an Amazon Redshift cluster within the same or a different Amazon Virtual Private Cloud (Amazon VPC), or even on-premises PostgreSQL-based database servers that can be reached using VPC peering, AWS managed VPN, or AWS Direct Connect. The remote PostgreSQL-based database instance or Amazon Redshift cluster can be in the same or a different AWS account with established network connectivity in place. For more information, see How can I troubleshoot connectivity to an Amazon RDS DB instance that uses a public or private subnet of a VPC?
Tables that you query in the foreign (remote) server. For this post, we create one table in the source database pgfdwsource and one table in the target database pgfdwtarget.
For pgfdwsource, create a salary table with the columns emailid and salary with dummy content. See the following code:
[ec2-user@ip-172-31-15-24 ~]$ psql -h pgfdwsource.xxxx.us-west-2.rds.amazonaws.com -d pgfdwsource -U pgfdwsource -w
SET
Expanded display is on.
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.
pgfdwsource=> create table salary(emailid varchar, salary int);
CREATE TABLE
pgfdwsource=> insert into salary values('[email protected]',10000);
INSERT 0 1
pgfdwsource=> insert into salary values('[email protected]',100000);
INSERT 0 1
pgfdwsource=> insert into salary values('[email protected]',1000000);
INSERT 0 1
pgfdwsource=> \d salary
      Table "public.salary"
 Column  |       Type        | Modifiers
---------+-------------------+-----------
 emailid | character varying |
 salary  | integer           |
pgfdwsource=> select * from salary;
       emailid       | salary
---------------------+---------
 [email protected] |   10000
 [email protected] |  100000
 [email protected] | 1000000
For pgfdwtarget, create a table corresponding to customer1 with the columns id, name, emailid, projectname, and contactnumber with dummy content. See the following code:
[ec2-user@ip-172-31-15-24 ~]$ psql -h pgfdwtarget.xxxx.us-west-2.rds.amazonaws.com -d pgfdwtarget -U pgfdwtarget
SET
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.
pgfdwtarget=> create table customer1( id int, name varchar, emailid varchar, projectname varchar, contactnumber bigint); CREATE TABLE pgfdwtarget=> insert into customer1 values(1,'Tom','[email protected]','Customer1.migration',328909432); INSERT 0 1 pgfdwtarget=> insert into customer1 values(2,'Harry','[email protected]','Customer1.etl',2328909432); INSERT 0 1 pgfdwtarget=> insert into customer1 values(3,'Jeff','[email protected]','Customer1.infra',328909432); INSERT 0 1 pgfdwtarget=> d customer1 Table "public.customer1" Column | Type | Modifiers ---------------+-------------------+----------- id | integer | name | character varying | emailid | character varying | projectname | character varying | contactnumber | bigint | pgfdwtarget=> select * from customer1; id | name | emailid | projectname | contactnumber ----+-------+---------------------+---------------------+--------------- 1 | Tom | [email protected] | Customer1.migration | 328909432 2 | Harry | [email protected] | Customer1.etl | 2328909432 3 | Jeff | [email protected] | Customer1.infra | 328909432 (3 rows) In both tables, the column emailid is common and can be used to derive insights into salary corresponding to customer1 employee information. Now let’s see this in action. Configuring your source instance, foreign server, user mapping, and foreign table All the steps in this section are performed after logging in with the role pgfdwsource into the primary database instance pgfdwsource. Connecting to the source instance You connect to your source instance with a master user or via a normal user that has rds_superuser permissions. You can use client tools like psql or pgAdmin. For more information, see Connecting to a DB instance running the PostgreSQL database engine. Create the extension postgres_fdw with CREATE EXTENSION: pgfdwsource=> conninfo You are connected to database "pgfdwsource" as user "pgfdwsource" on host "pgfdwsource.xxxx.us-west-2.rds.amazonaws.com" at port "5432". SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off) pgfdwsource=> du pgfdwsource List of roles Role name | Attributes | Member of -------------+-------------------------------+----------------- pgfdwsource | Create role, Create DB +| {rds_superuser} | Password valid until infinity | pgfdwsource=> create extension postgres_fdw; CREATE EXTENSION pgfdwsource=> dx List of installed extensions Name | Version | Schema | Description --------------+---------+------------+---------------------------------------------------- plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language postgres_fdw | 1.0 | public | foreign-data wrapper for remote PostgreSQL servers (2 rows) Creating a foreign server We use CREATE SERVER to create our foreign (remote) server mapping as the PostgreSQL-based server from which we pull the data. A foreign server typically encapsulates connection information that an FDW uses to access an external data resource. It uses the same connection options as libpq. SSLMODE ‘require’ makes sure that the data is encrypted in transit. See the following code: pgfdwsource=> create server my_fdw_target Foreign Data Wrapper postgres_fdw OPTIONS (DBNAME 'pgfdwtarget', HOST 'pgfdwtarget.xxxx.us-west-2.rds.amazonaws.com', SSLMODE 'require'); CREATE SERVER If the master user (or user with rds_superuser) is creating the foreign server, then other users need usage access to this server. 
See the following code: GRANT USAGE ON FOREIGN SERVER my_fdw_target TO normal_user; If you want to grant access to normal users to create a foreign server, you need to grant usage on the extension itself from the master user: GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO normal_user; Creating user mapping CREATE USER MAPPING defines a mapping of a user to a foreign server. In the following code, when the user pgfdwsource is connecting to the server my_fdw_target (remote database), they use the login information provided in the user mapping. Therefore, postgres_fdw uses the user pgfdwtarget to connect to the remote database. pgfdwsource=> CREATE USER MAPPING FOR pgfdwsource SERVER my_fdw_target OPTIONS (user 'pgfdwtarget', password 'test1234'); CREATE USER MAPPING Creating a foreign table CREATE FOREIGN TABLE creates a foreign table in the source database that is mapped to the table in the foreign server. It creates a table reference in the local database of a table residing in the remote database. The name of this table reference is defined as the foreign table. For example, in the following code, foreign table customer1_fdw points to the table customer1, which resides in the remote database. Now, whenever you need to query data from table customer1 in the remote database, the user in local database queries using the foreign table customer1_fdw. pgfdwsource=> create foreign table customer1_fdw( id int, name varchar, emailid varchar, projectname varchar, contactnumber bigint) server my_fdw_target OPTIONS( TABLE_NAME 'customer1'); CREATE FOREIGN TABLE Furthermore, the name of the foreign table or table reference can be different from the name of the table in the remote database. This flexibility can be useful when you want to mask the name of the table in remote database. Running federated queries with a PostgreSQL FDW FDWs (postgres_fdw) allow you to query (SELECT, INSERT, UPDATE, and DELETE) a remote PostgreSQL-based database (Amazon RDS for PostgreSQL, Aurora PostgreSQL, or Amazon Redshift). For this post, we focus on the SELECT query functionality. With the preceding steps complete, you can now query the table customer1, which resides in the Amazon RDS for PostgreSQL or Aurora PostgreSQL instance pgfdwtarget.xxxx.us-west-2.rds.amazonaws.com in the database pgfdwtarget, from within the instance pgfdwsource.xxxx.us-west-2.rds.amazonaws.com without the need for replication or any kind of pipeline. You can also run join queries to perform aggregates (and more) to derive insights. See the following code: pgfdwsource=> conninfo You are connected to database "pgfdwsource" as user "pgfdwsource" on host "pgfdwsource.xxxx.us-west-2.rds.amazonaws.com" at port "5432". SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off) -- running select * on foreign table customer1 mapped to name customer1_fdw in source instance. pgfdwsource=> select * from customer1_fdw; id | name | emailid | projectname | contactnumber ----+-------+---------------------+---------------------+--------------- 1 | Tom | [email protected] | Customer1.migration | 328909432 2 | Harry | [email protected] | Customer1.etl | 2328909432 3 | Jeff | [email protected] | Customer1.infra | 328909432 (3 rows) -- running select * on foreign table customer1 mapped to name customer1_fdw in source instance while joining it with salary table to derive salary insights. 
pgfdwsource=> select * from salary s inner join customer1_fdw c on s.emailid=c.emailid; emailid | salary | id | name | emailid | projectname | contactnumber ---------------------+---------+----+-------+---------------------+---------------------+--------------- [email protected] | 10000 | 1 | Tom | [email protected] | Customer1.migration | 328909432 [email protected] | 100000 | 2 | Harry | [email protected] | Customer1.etl | 2328909432 [email protected] | 1000000 | 3 | Jeff | [email protected] | Customer1.infra | 328909432 (3 rows) In the following code, only the data for a particular project is queried. The project name details are stored in the remote server. In the explain plan of the query, the WHERE clause has been pushed down to the remote server. This makes sure that only the data required to realize the result of the query is retrieved from the remote server and not the entire table. This is one of the optimizations that postgres_fdw offers. pgfdwsource=> select * from salary s inner join customer1_fdw c on s.emailid=c.emailid where c.projectname like 'Customer1.etl'; emailid | salary | id | name | emailid | projectname | contactnumber ---------------------+--------+----+-------+---------------------+---------------+--------------- [email protected] | 100000 | 2 | Harry | [email protected] | Customer1.etl | 2328909432 (1 row) pgfdwsource=> explain verbose select * from salary s inner join customer1_fdw c on s.emailid=c.emailid where c.projectname like 'Customer1.etl'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=100.00..118.98 rows=1 width=144) Output: s.emailid, s.salary, c.id, c.name, c.emailid, c.projectname, c.contactnumber Join Filter: ((s.emailid)::text = (c.emailid)::text) -> Seq Scan on public.salary s (cost=0.00..1.03 rows=3 width=36) Output: s.emailid, s.salary -> Materialize (cost=100.00..117.83 rows=3 width=108) Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber -> Foreign Scan on public.customer1_fdw c (cost=100.00..117.81 rows=3 width=108) Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber Remote SQL: SELECT id, name, emailid, projectname, contactnumber FROM public.customer1 WHERE ((projectname ~~ 'Customer1.etl'::text)) (10 rows) You can pull data from a different database within the same instance or from a different instance, as seen in the preceding examples. You can also pull data from your PostgreSQL-based data warehouse like Amazon Redshift for a quick query on archived data. Because the data is pulled in real time from a foreign database, you can achieve this without needing to set up any kind of replication or reserving space in the source database. 
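If you have many remote tables to expose, defining each foreign table by hand can get tedious. postgres_fdw also works with IMPORT FOREIGN SCHEMA (available in PostgreSQL 9.5 and later on the importing side), which pulls the column definitions from the remote server for you. The following is a minimal sketch that reuses the my_fdw_target server from this walkthrough; the schema and table names are illustrative:
-- Import definitions for selected remote tables instead of writing
-- CREATE FOREIGN TABLE for each one.
IMPORT FOREIGN SCHEMA public
    LIMIT TO (customer1)
    FROM SERVER my_fdw_target
    INTO public;
You can list several tables in LIMIT TO, drop it to import every table in the remote schema, or use EXCEPT to skip specific ones. The imported definitions behave exactly like foreign tables created manually, so the queries shown previously work unchanged.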
Cleaning up You can follow the steps in this section to clean up the resources created in previous steps: Drop the foreign table customer1_fdw: pgfdwsource=> drop foreign table customer1_fdw; DROP FOREIGN TABLE Drop the user mapping that maps the database user to the foreign server user: pgfdwsource=> drop user mapping for pgfdwsource SERVER my_fdw_target; DROP USER MAPPING Drop the foreign server definition that provides the local PostgreSQL server with the foreign (remote) server connection information: pgfdwsource=> drop server my_fdw_target; DROP SERVER Drop the extension postgres_fdw: pgfdwsource=> drop extension postgres_fdw; DROP EXTENSION Common error messages As is the case with any other setup, you may encounter some common issues while setting up postgres_fdw, like connection problems or permission issues. The following section lists some common error messages that you may encounter and also suggests common culprits for those errors, which should point you in the right direction to mitigate them. Network connectivity – The following error message can appear when trying to query the foreign table customer1_fdw from the source database. A connection timed out error message appears when there is no network connectivity between the source and target databases. Common culprits are incorrectly configured security groups, Network Access Control List (NACL), or route table. pgfdwsource=> select * from customer1_fdw; ERROR: could not connect to server "my_fdw_target" DETAIL: could not connect to server: Connection timed out Is the server running on host "pgfdwtarget.xxxx.us-west-2.rds.amazonaws.com" (172.31.30.166) and accepting TCP/IP connections on port 5432? DNS resolution – The following error message may appear when trying to query the foreign table customer1_fdw from the source database. This is a classic case of DNS resolution failure (issues in /etc/resolv.conf) for the endpoint. The culprit is the incorrect endpoint entered while creating the foreign server. pgfdwsource=> select * from customer1_fdw; ERROR: could not connect to server "my_fdw_target" DETAIL: could not translate host name "pgfdwtarget.xx.us-west-2.rds.amazonaws.com" to address: Name or service not known Table not present – In this scenario, the table notpresent was not in the target database instance. Therefore, the error message ERROR: relation "public.notpresent" does not exist The common culprit is a non-existent or dropped table. pgfdwsource=> create foreign table customer2_fdw( id int, name varchar, emailid varchar, projectname varchar, contactnumber bigint) server my_fdw_target OPTIONS( TABLE_NAME 'notpresent'); CREATE FOREIGN TABLE pgfdwsource=> select * from customer2_fdw; ERROR: relation "public.notpresent" does not exist CONTEXT: remote SQL command: SELECT id, name, emailid, projectname, contactnumber FROM public.notpresent User permission – In this scenario, a permission denied error occurs when setting up user mapping, saying that you don’t have permission over the schema test. Common culprits are lack of select permission on the schema or table for the user specified in user mappings. 
pgfdwsource=> create foreign table customer1_fdw( id int, name varchar, emailid varchar, projectname varchar, contactnumber bigint) server my_fdw_target OPTIONS( SCHEMA_NAME 'test', TABLE_NAME 'newtable'); CREATE FOREIGN TABLE pgfdwsource=> select * from customer1_fdw; ERROR: permission denied for schema test Additional capabilities In addition to SELECT, postgres_fdw allows you to run UPDATE, INSERT, and DELETE on a foreign table. The following code updates the foreign table customer1_fdw (note Foreign Update in the query plan): pgfdwsource=> explain update customer1_fdw set name='Jeff' where name='Jerry'; QUERY PLAN ------------------------------------------------------------------------------- Update on customer1_fdw (cost=100.00..119.73 rows=4 width=114) -> Foreign Update on customer1_fdw (cost=100.00..119.73 rows=4 width=114) (2 rows) pgfdwsource=> select * from customer1_fdw; id | name | emailid | projectname | contactnumber ----+-------+---------------------+---------------------+--------------- 1 | Tom | [email protected] | Customer1.migration | 328909432 2 | Harry | [email protected] | Customer1.etl | 2328909432 3 | Jerry | [email protected] | Customer1.infra | 328909432 (3 rows) pgfdwsource=> update customer1_fdw set name='Jeff' where name='Jerry'; UPDATE 1 pgfdwsource=> select * from customer1_fdw; id | name | emailid | projectname | contactnumber ----+-------+---------------------+---------------------+--------------- 1 | Tom | [email protected] | Customer1.migration | 328909432 2 | Harry | [email protected] | Customer1.etl | 2328909432 3 | Jeff | [email protected] | Customer1.infra | 328909432 (3 rows) postgres_fdw come with some small optimizations, such as FDW maintaining the table schema locally. As a result, rather than doing a SELECT * to pull all the data from a table in remote server, the FDW sends query WHERE clauses to the remote server to run, and doesn’t retrieve table columns that aren’t needed for the current query. It can also push down joins, aggregates, sorts, and more. postgres_fdw is quite powerful; this post covers just the basic functionality. For more information about these optimizations, see Appendix F. Additional Supplied Modules. Summary This post illustrated what you can achieve with federated query capability and postgres_fdw in Amazon RDS for PostgreSQL or Aurora PostgreSQL to query data from a PostgreSQL-based remote server, both managed and self-managed on Amazon EC2 or on premises. We encourage you to use the postgres_fdw extension of community PostgreSQL in your Amazon RDS for PostgreSQL and Aurora PostgreSQL environments to query data from multiple databases in remote PostgreSQL-based servers like Amazon Redshift, Amazon RDS for PostgreSQL, Aurora PostgreSQL, self-managed PostgreSQL servers on Amazon EC2 or on premises, or from the same server in real time without the need for setting up ongoing replication or a data pipeline. About the Authors Vibhu Pareek is a Solutions Architect at AWS. He joined AWS in 2016 and specializes in providing guidance on cloud adoption through the implementation of well architected, repeatable patterns and solutions that drive customer innovation. He has keen interest in open source databases like PostgreSQL. In his free time, you’d find him spending hours on automobile reviews, enjoying a game of football or engaged in pretentious fancy cooking.     
Gaurav Sahi is a Principal Solutions Architect based out of Bangalore, India and has rich experience across embedded, telecom, broadcast & digital video and cloud technologies. In addition to helping customers in their transformation journey to cloud, his current passion is to explore and learn AI/ML services. He likes to travel, enjoy local food and spend time with family and friends in his free time. https://aws.amazon.com/blogs/database/federated-query-support-for-amazon-aurora-postgresql-and-amazon-rds-for-postgresql/
0 notes
globalmediacampaign · 5 years ago
Text
Migrating legacy PostgreSQL databases to Amazon RDS or Aurora PostgreSQL using Bucardo
If you are using PostgreSQL earlier than 9.4, you are using an unsupported version of PostgreSQL, and may have limited options to migrate or replicate your databases in Amazon RDS or Amazon Aurora PostgreSQL. This is primarily because PostgreSQL versions older than 9.4 can’t perform logical replication. Bucardo is an open-source utility that can replicate data changes asynchronously to multiple secondary or multiple masters. It is a trigger-based replication and proven to be consistent and stable for more extensive migrations and ongoing replications. Bucardo can perform full load for tables without a primary key. However, to replicate delta data changes from Primary, create a primary key before you start the setup. This post demonstrates how to set up Bucardo and replicate data changes from PostgreSQL 8.4 to PostgreSQL 9.6. Prerequisites Before getting started, you must have the following: One EC2 instance with Ubuntu 16.04 for Bucardo (Bucardo Server: 172.31.88.4) One EC2 instance with RHEL 6 with PostgreSQL 8.4.2 (PostgreSQL 8.4.2: 172.31.16.177) One RDS PostgreSQL 9.6 in us-east-1 (RDS 9.6) This post uses PostgreSQL 8.4.2 on Amazon EC2; however, the PostgreSQL database might be running on-premises. This solution installs Bucardo 5.4.1 on Ubuntu 16.04, which means that the repository for Bucardo is on the same host running on a PostgreSQL 9.6 instance. The following diagram shows the architecture of the data replication flow. Fig: Replication Architecture to migrate PostgreSQL 8.4 to RDS PostgreSQL 9.6 using Bucardo. Installing Bucardo binaries There are several packages that you must install before installing Bucardo. See the following code: #apt-get install postgresql-plperl-9.6 libdbd-pg-perl libboolean-perl build-essential libdbd-mock-perl libdbd-pg-perl libanyevent-dbd-pg-perl libpg-hstore-perl libpgobject-perl Connect to CPAN and install DBI,DBD::Pg,DBIx::Safe. See the following code: cpan > install DBI cpan > install DBD::Pg cpan > install DBIx::Safe Download the Bucardo binaries into your local directory and untar. See the following code: $wget http://bucardo.org/downloads/Bucardo-5.4.1.tar.gz tar xvfz Bucardo-5.4.1.tar.gz $perl Makefile.PL $sudo make install Creating superusers and the repository database You must create a Bucardo superuser and repository database to control and track the replications between environments. Connect to DB-APP1 using the PSQL client or pgadmin4 and create the superuser and repository on DR-App1. 
See the following code: postgres=# create user bucardo superuser; CREATE ROLE postgres=# create database bucardo; CREATE DATABASE postgres=# du List of roles Role name | Attributes | Member of -----------+------------------------------------------------------------+----------- bucardo | Superuser | {} postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {} replica | Replication | {} postgres=# alter database bucardo owner to bucardo; ALTER DATABASE postgres=# list List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+-------------+-------------+----------------------- bucardo | bucardo | UTF8 | en_US.UTF-8 | en_US.UTF-8 | postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres (4 rows) After you create the superuser and repository database, exit from PSQL and run “bucardo install” from the terminal where Bucardo software is staged. This creates a set of tables in the Bucardo database (the database owner should be a Bucardo superuser). Installing the Bucardo repository To install the Bucardo repository, enter the following code: postgres@ip-172-31-88-4:~/Bucardo-5.4.1$ sudo bucardo install This installs the Bucardo database into an existing PostgreSQL cluster. You must have compiled PostgreSQL with Perl support, and connect as a superuser. See the following code: Current connection settings: 1. Host: 2. Port: 5432 3. User: postgres 4. Database: bucardo 5. PID directory: /var/run/bucardo Enter a number to change it, P to proceed, or Q to quit: p Postgres version is: 9.6 Attempting to create and populate the bucardo database and schema Database creation is complete Updated configuration setting "piddir" Installation is now complete. If you see errors or need help, you can contact Bucardo support for assistance at [email protected]. You may want to check over the configuration variables next. See the following code: bucardo show all Change any setting by using: bucardo set foo=bar postgres@ip-172-31-88-4:~/Bucardo-5.4.1$ Whitelisting Bucardo and the PostgreSQL database server to connect with each other Use pgpass to set up passwordless authentication to connect the source and target databases securely. On the Bucardo server, enter the following code: postgres@ip-172-31-88-4:~$ touch ~/.pgpass postgres@ip-172-31-88-4:~$ chmod 0600 ~/.pgpass postgres@ip-172-31-88-4:~$ cat ~/.pgpass #server:port:database:username:password 127.0.0.1:5432:postgres:postgres:XXXXXX 172.31.88.4:5432:bucardo:postgres:XXXXXX 172.31.16.177:5432:repdb:postgres:XXXXXX pgrds.cxad2e11vriv.us-east-1.rds.amazonaws.com:5432:repdb:postgres:XXXXXX Verify that the Bucardo server can connect the source and target databases without a password with the following code: postgres@ip-172-31-88-4:~$ psql -h 172.31.16.177 -d repdb -U postgres -w -c "select count(*) from pgbench_branches" count ------- 1 (1 row) postgres@ip-172-31-88-4:~$ psql --host 'pgrds.cxad2e11vriv.us-east-1.rds.amazonaws.com' --port 5432 --username 'postgres' 'repdb' -w -c "select count(*) from pgbench_branches" count ------- 1 (1 row) Resolving a permission denied error Because RDS is a managed service, AWS doesn’t provide superuser privileges for security reasons. To perform a trigger-based replication, you must enable the parameter session_replication_role. 
You can use the security definer function rds_session_replication_role, which helps you to set the parameter to replica when an event occurs. To be consistent across all environments, this post creates the security definer function in EC2 PostgreSQL (8.4.2) and RDS. Create language plpgsql; with the following code: CREATE OR REPLACE FUNCTION public.rds_session_replication_role(role text) RETURNS text LANGUAGE plpgsql SECURITY DEFINER AS $function$ DECLARE curr_val text := 'unset'; BEGIN EXECUTE 'SET session_replication_role = ' || quote_literal(role); EXECUTE 'SHOW session_replication_role' INTO curr_val; RETURN curr_val; END $function$; postgres=> revoke all on function rds_session_replication_role(text) from public; REVOKE postgres=> grant execute on function rds_session_replication_role(text) to rds_superuser; GRANT postgres=> grant rds_superuser to postgres; GRANT Also, make changes to the bucardo.pm file at lines 5397 and 5330 with the following code: $dbh->do(q{select rds_session_replication_role('replica');}); ## Assumes a sane default ! From. $dbh->do(q{SET session_replication_role = default}); ## Assumes a sane default ! Alternatively, you can download the updated bucardo.pm file and move the file to the server where Bucardo is running under the location /usr/local/share/perl/5.22.1/Bucardo.pm. If you are running in production, please test it before using it. Generating a sample source database and initiating target full load On the source database, generate some test data using pgbench. This post generates four tables, three with the primary key enabled and one without the primary key enabled. See the following code: postgres=# create database repdb; CREATE DATABASE The following code is the generated sample data in repdb: [postgres@ip-172-31-16-177 ~]$ pgbench -i repdb NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping creating tables... 10000 tuples done. 20000 tuples done. 30000 tuples done. 40000 tuples done. 50000 tuples done. 60000 tuples done. 70000 tuples done. 80000 tuples done. 90000 tuples done. 100000 tuples done. set primary key... NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "pgbench_branches_pkey" for table "pgbench_branches" NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "pgbench_tellers_pkey" for table "pgbench_tellers" NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "pgbench_accounts_pkey" for table "pgbench_accounts" vacuum...done. Verify the data count and table structures. See the following code: repdb=# select count(*) from pgbench_accounts; 100000 repdb=# select count(*) from pgbench_branches; 1 repdb=# select count(*) from pgbench_history; 0 repdb=# select count(*) from pgbench_tellers; 10 Migrating repdb from the source database using pg_dump and pg_restore Back up the source database using pg_dump. See the following code: postgres@ip-172-31-88-4:~$ pg_dump -Fc -v -h ec2-34-229-97-46.compute-1.amazonaws.com -U postgres repdb -w > repdb_bkp1.dump pg_dump: last built-in OID is 16383 pg_dump: reading extensions pg_dump: identifying extension members …… Log in to RDS PostgreSQL and create the database repdb. 
See the following code: postgres@ip-172-31-88-4:~$ psql --host 'pgrds.cxad2e11vriv.us-east-1.rds.amazonaws.com' --port 5432 --username 'postgres' 'postgres' Password for user postgres: psql (9.6.15) SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off) Type "help" for help. postgres=> create database repdb; CREATE DATABASE Restore the dump file generated in the newly created repdb in RDS PostgreSQL using pg_restore. See the following code: postgres@ip-172-31-88-4:~$ pg_restore -v -h pgrds.cxad2e11vriv.us-east-1.rds.amazonaws.com -U postgres -d repdb repdb_bkp1.dump pg_restore: connecting to database for restore Password: pg_restore: creating SCHEMA "public" pg_restore: creating COMMENT "SCHEMA public" pg_restore: creating PROCEDURAL LANGUAGE "plpgsql" For more information, see Importing Data into PostgreSQL on Amazon RDS. Configuring Bucardo to replicate tables with a primary key A typical Bucardo setup consists of steps to add the source and target databases, add tables with a primary key to the group, and create and enable the sync to start replicating the changes from source. To add the source database, enter the following code: postgres@ip-172-31-88-4:~$ bucardo add db pgdb84 dbhost=ec2-34-229-97-46.compute-1.amazonaws.com dbport=5432 dbname=repdb dbuser=postgres Added database "pgdb84" To add the target RDS database, enter the following code: postgres@ip-172-31-88-4:~$ bucardo add db rds96 dbhost=pgrds.cxad2e11vriv.us-east-1.rds.amazonaws.com dbport=5432 dbname=repdb dbuser=postgres dbpass=postgres123 Added database "rds96" To add tables to the herd, enter the following code: postgres@ip-172-31-88-4:~$ bucardo add table pgbench_accounts pgbench_branches pgbench_tellers herd=herd_pg84 db=pgdb84 Created the relgroup named "herd_pg84" The following tables or sequences are now part of the relgroup "herd_pg84": public.pgbench_accounts public.pgbench_branches public.pgbench_tellers To add the database group, enter the following code: postgres@ip-172-31-88-4:~$ bucardo add dbgroup pgdb84_to_rds96 pgdb84:source rds96:target Created dbgroup "pgdb84_to_rds96" Added database "pgdb84" to dbgroup "pgdb84_to_rds96" as source Added database "rds96" to dbgroup "pgdb84_to_rds96" as target postgres@ip-172-31-88-4:~$ bucardo add sync sync_pg84_rds96 relgroup=herd_pg84 db=pgdb84,rds96 Added sync "sync_pg84_rds96" Using existing dbgroup "dbgrp84_96" You can have multiple databases in a particular database group. Check the Bucardo sync status before you start to make sure that you see the parameters created. See the following code: postgres@ip-172-31-88-4:~$ sudo bucardo status sync_pg84_rds96 [sudo] password for postgres: ====================================================================== Sync name : sync_pg84_rds96 Current state : No records found Source relgroup/database : herd_pg84 / pgdb84 Tables in sync : 3 Status : Active Check time : None Overdue time : 00:00:00 Expired time : 00:00:00 Stayalive/Kidsalive : Yes / Yes Rebuild index : No Autokick : Yes Onetimecopy : No Post-copy analyze : Yes Last error: : ====================================================================== Start Bucardo and verify its status. 
See the following code: postgres@ip-172-31-88-4:~$ sudo bucardo start Checking for existing processes Removing file "/var/run/bucardo/fullstopbucardo" Starting Bucardo postgres@ip-172-31-88-4:~$ sudo bucardo status sync_pg84_rds96 ====================================================================== Last good : Dec 05, 2019 08:30:03 (time to run: 1s) Rows deleted/inserted : 0 / 0 Sync name : sync_pg84_rds96 Current state : Good Source relgroup/database : herd_pg84 / pgdb84 Tables in sync : 3 Status : Active Check time : None Overdue time : 00:00:00 Expired time : 00:00:00 Stayalive/Kidsalive : Yes / Yes Rebuild index : No Autokick : Yes Onetimecopy : No Post-copy analyze : Yes Last error: : ====================================================================== The Current State is Good and no inserts, updates, and deletes are happening currently in the source database. To test the replication, generate a test load in the source database using pgbench and monitor changes on the target. See the following code: [postgres@ip-172-31-16-177 ~]$ pgbench -t 10000 repdb starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 query mode: simple number of clients: 1 number of transactions per client: 10000 number of transactions actually processed: 10000/10000 tps = 503.183795 (including connections establishing) tps = 503.244214 (excluding connections establishing) After you run pgbench, it generates some transactions, but Bucardo can’t move to the target due to a permission issue. Therefore, the status of Current State is Bad. See the following code: postgres@ip-172-31-88-4:~$ sudo bucardo status sync_pg84_rds96 ====================================================================== Last bad : Dec 05, 2019 08:32:54 (time until fail: 1s) Sync name : sync_pg84_rds96 Current state : Bad Source relgroup/database : herd_pg84 / pgdb84 Tables in sync : 3 Status : Active Check time : None Overdue time : 00:00:00 Expired time : 00:00:00 Stayalive/Kidsalive : Yes / Yes Rebuild index : No Autokick : Yes Onetimecopy : No Post-copy analyze : Yes Last error: : Failed : DBD::Pg::db do failed: ERROR: function rds_session_replication_role(unknown) does not exist LINE 1: select rds_session_replication_role('replica'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. at /usr/local/share/perl/5.22.1/Bucardo.pm line 5328. Line: 5041 Main DB state: ? Error: none DB pgdb84 state: ? Error: none DB rds96 state: 42883 Error: 7 (KID 14864) ====================================================================== If you encounter this error, follow the steps to resolve a permission denied error. In this example, security definer functions were not created in source and target databases and caused the preceding error. After implementing the security definer, restart Bucardo. See the following code: postgres@ip-172-31-88-4:~$ sudo bucardo restart Creating /var/run/bucardo/fullstopbucardo ... Done Checking for existing processes Removing file "/var/run/bucardo/fullstopbucardo" Starting Bucardo The Current State is now Good, and 294 deletes and inserts happened in the database. This confirms that your Bucardo is healthy. You can ignore the entry for Last error. 
See the following code: postgres@ip-172-31-88-4:~$ sudo bucardo status sync_pg84_rds96 ====================================================================== Last good : Dec 05, 2019 08:51:21 (time to run: 1s) Rows deleted/inserted : 294 / 294 Last bad : Dec 05, 2019 08:35:06 (time until fail: 1s) Sync name : sync_pg84_rds96 Current state : Good Source relgroup/database : herd_pg84 / pgdb84 Tables in sync : 3 Status : Active Check time : None Overdue time : 00:00:00 Expired time : 00:00:00 Stayalive/Kidsalive : Yes / Yes Rebuild index : No Autokick : Yes Onetimecopy : No Post-copy analyze : Yes Last error: : Failed : DBD::Pg::db do failed: ERROR: function rds_session_replication_role(unknown) does not exist LINE 1: select rds_session_replication_role('replica'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. at /usr/local/share/perl/5.22.1/Bucardo.pm line 5328. Line: 5041 Main DB state: ? Error: none DB pgdb84 state: ? Error: none DB rds96 state: 42883 Error: 7 (KID 15006) ====================================================================== To debug the replication, Bucardo logs are located in the /var/log directory. See the following code: postgres@ip-172-31-88-4:~$ tail -f /var/log/bucardo/log.bucardo (15337) [Thu Dec 5 08:51:25 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_tellers : 10 (15337) [Thu Dec 5 08:51:25 2019] KID (sync_pg84_rds96) Totals: deletes=297 inserts=297 conflicts=0 (15337) [Thu Dec 5 08:51:25 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_accounts : 109 (15337) [Thu Dec 5 08:51:25 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_branches : 1 (15337) [Thu Dec 5 08:51:25 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_tellers : 10 (15337) [Thu Dec 5 08:51:25 2019] KID (sync_pg84_rds96) Totals: deletes=120 inserts=120 conflicts=0 (15337) [Thu Dec 5 08:51:26 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_accounts : 239 (15337) [Thu Dec 5 08:51:26 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_branches : 1 (15337) [Thu Dec 5 08:51:26 2019] KID (sync_pg84_rds96) Delta count for pgdb84.public.pgbench_tellers : 10 (15337) [Thu Dec 5 08:51:26 2019] KID (sync_pg84_rds96) Totals: deletes=250 inserts=250 conflicts=0 Conclusion This post demonstrated the complete solution to overcome the challenge of migrating legacy PostgreSQL databases older than 9.4 to Amazon RDS PostgreSQL or Aurora PostgreSQL by using the asynchronous trigger-based replication utility Bucardo. If you have comments or questions about this solution, please submit them in the comments section.   About the Authors   Rajeshkumar Sabankar is a Database Specialty Architect with Amazon Web Services. He works with internal Amazon customers to build secure, scalable, and resilient architectures in AWS Cloud and help customers perform migrations from on-premise databases to Amazon RDS and Aurora Databases.       Samujjwal Roy is a Database Specialty Architect with the Professional Services team at Amazon Web Services. He has been with Amazon for 15+ years and has led migration projects for internal and external Amazon customers to move their on-premises database environment to AWS Cloud database solutions.     https://probdm.com/site/MTk1Mzg
globalmediacampaign · 5 years ago
Text
Migration tips for developers converting Oracle and SQL Server code to PostgreSQL
PostgreSQL is one of the most popular open-source relational database systems. It is considered to be one of the top database choices when customers migrate from commercial databases such as Oracle and Microsoft SQL Server. AWS provides two managed PostgreSQL options: Amazon RDS and Amazon Aurora. In addition to providing managed PostgreSQL services, AWS also provides tools and resources to help with migration. AWS Schema Conversion Tool (SCT) is a free AWS tool that helps you convert your existing schemas and supports several source and target databases. AWS also provides AWS Database Migration Service (DMS), which helps transfer and continuously replicate data between heterogeneous and homogenous databases. Similarly, AWS provides migration playbooks that document a large number of functional mappings between commercial databases and open-source databases such as PostgreSQL. This post provides tips and best practices for converting code from PL/SQL to PL/pgSQL, which can help achieve better performance and code conversions to PostgreSQL. This post is targeted for developers working on database migrations and assumes that the readers have a basic knowledge of databases and PL/SQL. Performance considerations This section provides some of the factors that influence PostgreSQL performance improvements while migrating from commercial or legacy databases such as SQL Server or Oracle. Most of the databases have similar objects, but considering the right object, changes the behavior of the system. This section explains how to achieve better performance with stored procedures, functions, and SQL statements. Data types To avoid re-work, correctly map the data types in the target database to the source system before starting the project. The following table summarizes some common data type mapping from Oracle and SQL Server to PostgreSQL. Oracle PostgreSQL SQL Server Notes Number Small Integer Tinyint / Smallint Generally for lookup tables whose values of the table are limited. Number Integer / Bigint Integer / Bigint Number Double Precision / Float / Numeric Double Precision / Float / Numeric For the financial domain in which you want an application to store high precision value, you can configure it as numeric/decimal. Otherwise, double precision or float may be sufficient. Varchar Char(n) Varchar(n) Varchar Text Character varying Nchar Nvarchar Ntext Timestamp(6) Timestamp without timezone DateTime2(p) DateTime Clob Text Blob Raw Bytea Binary, Image, VarBinary Boolean Boolean Bit XML XML XML Why number to smallint/integer/bigint and not numeric? To get the best performance from your database, it is important to use optimal data types. If your table column must hold a maximum of a four-digit number, the column data type with 2 (smallint) bytes is sufficient, rather than defining 4 (integer/real), 8 (bigint/double precision), or variable (numeric) byte data types. Numeric is a complex type that can hold 131,000 digits and is recommended for storing monetary amounts and other quantities for which exactness is required. However, calculations on numeric values are very slow compared to the integer types or floating-point types, because its operators are slow. The following table gives an example of how the size of a table grows for a single column when you compare numeric size with smallint/int/bigint for non-precision columns, excluding indexes. 
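Figures like those in the following table can be reproduced with the standard size functions. A minimal sketch is shown below; the single-column table definitions and the inserted value are illustrative assumptions rather than the exact DDL used for the test, and the mapping to the table's Size and External size columns is approximate.
-- illustrative single-column tables for the size comparison
Create Table numericsize (val numeric);
Create Table intsize (val integer);
Insert Into numericsize Values (123457);
Insert Into intsize Values (123457);
-- heap size only (roughly the Size column)
Select pg_size_pretty(pg_relation_size('numericsize'));
-- total size including indexes and TOAST (roughly Size plus External size)
Select pg_size_pretty(pg_total_relation_size('numericsize'));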
TN Size External size Value inserted numericsize 16 KB 8192 bytes Insert into numericsize value (1234678) smallintsize 8192 bytes 0 bytes Insert into numericsize value (1234) intsize 8192 bytes 0 bytes Insert into numericsize value (123457) bigintsize 8192 bytes 0 bytes Insert into numericsize value (123486) The following table uses the same information as the previous table, but includes indexes. For this table, size refers to the total size of the table, and external size is the size of related objects such as indexes. TN Size External size numericsize 32 KB 24 KB smallintsize 24 KB 16 KB intsize 24 KB 16 KB bigintsize 24 KB 16 KB AWS SCT maps number to numeric data type for tables without knowing the actual data size. This tools have an option to configure/map to right data type while conversion. Procedures and functions PostgreSQL 10 and older versions do not have procedures support. All the procedures and functions from Oracle and SQL Server are mapped to functions in PostgreSQL. The procedures are supported in PostgreSQL, starting from version 11, and are similar to Oracle. PostgreSQL supports three volatility categories of functions and you must use the appropriate type based on the functionality while migrating: Volatile, Stable, and Immutable. Marking the function type appropriately could be an important performance tweak. Volatile The Volatile type indicates that the function value can change even within a single table scan, so that no optimizations can be made. Relatively few database functions are volatile; some examples are random(), currval(), and timeofday(). Any function that has side effects must be classified as volatile, even if its result is predictable, to prevent calls from being optimized away, one example is setval(). If the volatility type is not provided during function creation, all new functions are marked as volatile by default. Below is a sample function to show the time taken to execute the Volatile function. Create Or Replace Function add_ten_v(num int) Returns integer AS $$ Begin Perform pg_sleep(0.01); Return num + 10; End $$ Language 'plpgsql' Volatile; Execute the function below to see the cost of the function. lab=>Explain Analyze Select add_ten_v(10)FROM generate_series(1,100,1); Query plan ----------------------------------------------------------------------------- Function Scan on generate_series (cost=0.00..260.00 rows=1000 width=4) (actual time=10.200..1015.461 rows=100 loops=1) Planning time: 0.030 ms Execution time: 1015.501 ms (3 rows) Time: 1016.313 ms Stable The Stable type indicates that the function cannot modify the database. It also indicates that within a single table scan, it consistently returns the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups or parameter variables, such as the current time zone. The current_timestamp family of functions qualifies as stable, because their values do not change within a transaction. Below is a sample function to show the time taken to execute the Stable function. Create Or Replace Function add_ten_s(num int) Returns integer AS $$ Begin Perform pg_sleep(0.01); Return num + 10; End $$ Language 'plpgsql' Stable; Execute the function below to see the cost of the function. 
lab=> Explain Analyze Select add_ten_s(10) From generate_series(1,100,1); Query Plan ------------------------------------------------------------------------------------------------------------------------- Function Scan on generate_series (cost=0.00..260.00 rows=1000 width=4) (actual time=10.153..1013.814 rows=100 loops=1) Planning time: 0.031 ms Execution time: 1013.846 ms (3 rows) Time: 1014.507 ms Immutable The Immutable type indicates that the function cannot modify the database and always returns the same result when given the same argument values. This means it does not do database lookups or otherwise use information not directly present in its argument list. If this option is given, any call of the function with all-constant arguments can be immediately replaced with the function value. Below is the sample function to show the time taken to execute the Immutable function. Create Or Replace Function add_ten_i(num int) Returns integer AS $$ Begin Perform pg_sleep(0.01); Return num + 10; End $$ Language 'plpgsql' Immutable; Execute the function below to see the cost of the function. lab=> Explain Analyze Select Add_Ten_I(10) From Generate_Series(1,100,1); Query Plan -------------------------------------------------------------------------------------------------------------------- Function Scan on generate_series (cost=0.00..10.00 rows=1000 width=4) (actual time=0.009..0.016 rows=100 loops=1) Planning time: 10.185 ms Execution time: 0.030 ms (3 rows) Time: 10.681 ms All of these functions return the following value: lab=> Select Add_Ten_V(10), Add_Ten_S(10), Add_Ten_I(10); add_ten_v | add_ten_s | add_ten_i -----------+-----------+----------- 20 | 20 | 20 (1 row) Though all of the above functions deliver the same value, you may need to use any of these three function types, depending on the functionality, to achieve better performance. The test run of each of these functions shows that the functions contain the same functionality, but the Immutable variant takes the minimum amount of time. This is because this category allows the optimizer to pre-evaluate the function during the query calls with constant arguments. Function calls in views and queries Many applications use views and queries that contain function calls. As discussed in the previous section, in PostgreSQL, this can be a costly operation, especially if the function volatility category is not set correctly. In addition to this, the function call itself adds to the query cost. Choose the appropriate volatility for your function based on what the function does. If your function truly is Immutable or Stable, setting it instead of using the default of Volatile could give you some performance advantages. The following example code is a query with the Volatile function call. Explain Analyze Select Empid, Empname, Getdeptname(Deptid), Salary, Doj, Address From Emp Where Deptid=2 The function getDeptname() is marked as volatile. The total runtime for the query is 2 seconds and 886 milliseconds. The following example code is a query with the Stable function call. Explain Analyze Select Empid, Empname, Getdeptnames(Deptid), Salary, Doj, Address From Emp Where Deptid=2 The function getDeptname() is marked as stable. The total runtime for the query is 2 seconds and 644 milliseconds. The following example code replaces the function call with functionality. 
Explain Analyze Select Empid, Empname, Deptname, Salary, Doj, Address
From Emp E
Join Dept D On D.Deptid = E.Deptid
Where E.Deptid=2
The function logic is moved into the query successfully. The total runtime for the query is 933 milliseconds.
Optimizing exceptions
PostgreSQL provides the functionality to trap and raise errors using the Exception and Raise statements. This is useful functionality, but it comes at a cost. Raise statements raise errors and exceptions during a PL/pgSQL function's operation. By default, any error inside a PL/pgSQL function aborts the execution and rolls back the changes. To recover from errors, PL/pgSQL can trap them using the Exception clause. For this functionality, PostgreSQL has to save the state of the transaction before entering the block of code with exception handling. This is an expensive operation, so it adds overhead. To avoid this overhead, it is recommended to either catch exceptions on the application side, or make sure that the required validation is in place so that the function never raises an exception. The following code example demonstrates the performance impact of having an exception in a function call.
Create Or Replace Function empsal (eid int) Returns Integer AS $total$
Declare
  Total Integer;
Begin
  Update Emp Set Salary = Salary * 0.20 Where Empid = Eid;
  Return 1;
End;
$total$ Language plpgsql;

Create Or Replace Function Empsalexcep (Eid Int) Returns Integer AS $Total$
Declare
  Total Integer;
Begin
  Update Emp Set Salary = Salary * 0.20 Where Empid = Eid;
  Return 1;
Exception
  When Others Then
    Raise Notice 'Salary Update Failed';
End;
$Total$ Language plpgsql;

Select * From Empsal(3);      -- 78 ms, without exception handling
Select * From Empsalexcep(3); -- 84 ms, with exception handling
If you can't validate the condition up front, the exception is clearly required. In the preceding example, however, you can check the diagnostics to see whether a change actually took place. It is good practice to avoid exception handling where possible.
Counter not required for fetch operation
Many applications get the total count and loop through the cursor to fetch the records. Because the fetch operation returns null when there are no more records, it is better to use the fetch status rather than looping through the count by declaring another two variables and checking the count. You can avoid declaring extra variables and checking incremental values, reducing the number of statements to execute and achieving better performance. See the following code as an example.
Select Count(1) Into Count_Value From Tab1 Where Tab1.A = Value;
Counter := 0;
Open Dvscriptcursor For Select Id From Tab1;
While (Counter < Count_Value) Loop
  Fetch Dvscriptcursor Into Var_Id;
  ……..
  …….
  Counter := Counter + 1;
End Loop
You can also rewrite this code as shown below. This avoids declaring the two extra variables and uses the cursor itself to iterate, relying on the fetch status to exit the loop.
Open Dvscriptcursor For Select Id From Tab1;
Loop
  Fetch Dvscriptcursor Into Var_Id;
  Exit When Not Found;
  ……..
  …….
  …….
End Loop
Check with EXISTS rather than count
In legacy applications, SQL queries are written to find the count of matching records and then apply the required business logic. If a table has billions of records, getting the record count can be costly. The code sample below demonstrates how to check the count of rows and then update the data.
Create Or Replace Function Empsal (Eid Int) Returns Integer As $Total$
Declare
  Total Integer;
Begin
  If (Select Count(*) From Emp Where Empid = Eid) > 0 Then -- Wrong Usage
    Update Emp Set Salary = Salary * 0.20 Where Empid = Eid;
  End If;
  Return 1;
End;
$Total$ Language plpgsql;
The total runtime of the query is 163 milliseconds. This code can also be rewritten to check for one column rather than an entire row, which can be more cost-effective and better performing. See the sample code below.
Create Or Replace Function Empsal (Eid Int) Returns Integer AS $Total$
Declare
  Total Integer;
Begin
  If Exists (Select 1 From Emp Where Empid = Eid) Then -- Right Usage
    Update Emp Set Salary = Salary * 0.20 Where Empid = Eid;
  End If;
  Return 1;
End;
$Total$ Language plpgsql;
The total runtime of the query is 104 milliseconds.
Record count after DML statements
In most legacy applications, the record count indicates whether a data manipulation statement changed anything. In PostgreSQL, this information is maintained in its statistics and can be retrieved, so you can avoid re-counting the values after the operation. Use diagnostics to retrieve the number of rows affected, as shown in the code sample below.
Create Or Replace Function Empsal (Eid Int) Returns Integer AS $Total$
Declare
  Total Integer;
  Rows_Affected Int;
Begin
  If Exists (Select 1 From Emp Where Empid = Eid) Then
    Update Emp Set Salary = Salary * 0.20 Where Empid = Eid;
    Get Diagnostics Rows_Affected = ROW_COUNT;
  End If;
  Return 1;
End;
$Total$ Language plpgsql;
Pattern match and search
It's common practice to use the wildcard character % or _ with the LIKE (or ILIKE for case-insensitive searches) expression while retrieving data from tables. If the wildcard character is at the start of the given pattern, the query planner can't use an index even if one exists. In this case, a sequential scan is used, which is a time-consuming operation. To get better performance with millions of records and allow the query planner to use the available indexes, place the wildcard character in the middle or at the end rather than at the beginning of the predicate. In addition to the LIKE expression, you can also use the pg_trgm module/extension for pattern matching. The pg_trgm module provides functions and operators that you can use to determine the similarity of alphanumeric text. It also provides index operator classes that support fast searching for similar strings. For more information, see the pg_trgm documentation on the PostgreSQL website.
Conversion mapping between Oracle, SQL Server, and PostgreSQL
This section provides database-specific comparisons for writing SQL statements across Oracle, SQL Server, and PostgreSQL databases.
Default FROM clause
In Oracle, the FROM clause is mandatory, in which case you would use Select 1 from Dual;. In PostgreSQL and SQL Server, it is optional, so you can simply use Select 1;.
Generating a series of values
You can generate a series of values from a start to an end number. In Oracle, you don't need a starting number, but you can give an end number. See the following code as an example.
Select Rownum As Rownum_Value From Dual Connect By Level <= 64
When using a start and end number, use the following code.
With t(n) As (
  Select 1 from dual
  Union All
  Select n+1 From t Where n < 64
)
Select * From t;
In PostgreSQL, use the following code.
Select Generate_Series(1,64) AS Rownum_Value
In SQL Server, use the following code.
;With n(n) As ( Select 1 Union All Select n+1 From n Where n < 64 ) Select n From n Order By n Join with (+) operator In Oracle, for a left join, use the following code. Select b.id, b.title, b.author, b.year_published, l.name language From books b, ibrary.languages l Where l.id (+)= b.language_id Order By b.id For a right join, use the following code. Select b.id, b.title, b.author, b.year_published, l.name language From books b, ibrary.languages l Where l.id = b.language_id (+) Order BY b.id For more information, see SQL for Beginners (Part 5): Joins on the Oracle database site. There is no feature called “+” in PostgreSQL or SQL Server to do a left or right join to the tables. Instead, use the following two queries. Select b.id, b.title, b.author, b.year_published, l.name language From books b, Left join ibrary.languages l On l.id = b.language_id Order BY b.id Select b.id, b.title, b.author, b.year_published, l.name language From books b, Right join ibrary.languages l On l.id = b.language_id Order By b.id Type as a parameter to functions In SQL Server, you can pass multiple records with the Type data type. To implement the same in PostgreSQL, you can use it as a JSON or text data type in JSON format or array. The following example code is with text data type in JSON format with multiple records. You can insert it into a temporary table and process it further with the following code. Create Table emptable1 ( empid integer, last_name varchar(100), first_name varchar(100), deptid integer, salary double precision ) Oracle The following code shows how multiple records can be passed in the varchar data type in Oracle. DECLARE StructType Varchar2(1000) Default '[{"empid" : 1, "last_name":"AccName1", "first_name":"AccName1", "deptid":"1", "salary":"1234.578"} ,{"empid" : "2", "last_name":"AccName2", "first_name":"AccName2", "deptid":"2", "salary":"4567.578"} ]'; Begin Insert Into emptable1 (empid,last_name,first_name,deptid,salary) With Json As ( Select StructType --'[{"firstName": "Tobias", "lastName":"Jellema"},{"firstName": "Anna", "lastName":"Vink"} ]' doc from dual ) Select empid,last_name,first_name,deptid,salary From json_table( (Select StructType from json) , '$[*]' Columns ( empid PATH '$.empid' ,last_name Path '$.last_name' , first_name Path '$.first_name' ,deptid Path '$.deptid' ,salary Path '$.salary' ) ); End; SQL Server The following code shows how multiple records can be passed in table type in SQL Server for the same functionality given above in Oracle. --Create Type structure Create Type empTableType as Table ( empid integer, last_name varchar(100), first_name varchar(100), deptid integer, salary double precision ); --Create Procedure Create Procedure InsertEmpTable @InsertEmpt_TVP empTableType READONLY As Insert Into emptable1(empid,last_name,first_name,deptid,salary) Select * FROM @InsertEmpt_TVP; Go --Calling the SP with dynamic block and type Declare @EmpTVP AS empTableType; Insert Into @EmpTVP(empid,last_name,first_name,deptid,salary) Values (1,'FirstName','Last_name',1,1234.566), (2,'FirstName','Last_name',1,1234.566), (3,'FirstName','Last_name',1,1234.566), (4,'FirstName','Last_name',1,1234.566), (5,'FirstName','Last_name',1,1234.566); Exec InsertEmpTable @EmpTVP; Go PostgreSQL The following code shows how multiple records can be passed in as text type in PostgreSQL for the same functionality given above in Oracle and SQL Server. 
Do $$ Declare StructType Text Default '[{"empid" : "1", "last_name":"AccName1", "first_name":"AccName1", "deptid":"1", "salary":"1234.578"}, {"empid" : "2", "last_name":"AccName2", "first_name":"AccName2", "deptid":"2", "salary":"4567.578"}]'; Begin Insert Into emptable Select * From json_to_recordset(StructType::json) as x("empid" Int, "last_name" Varchar, "first_name" Varchar, "deptid" Int, "salary" Double Precision); End $$ Converting pivoting In PostgreSQL, the pivoting functionality is not enabled and requires an extension. The extension tablefunc enables the crosstab function, which you use creating pivot tables, similar to SQL Server and Oracle. The following is the pivoting functionality code in Oracle, SQL Server, and PostgreSQL. Create Table crosstabFunc ( id Number, customer_id Number, product_code Varchar2(5), quantity Number ); Insert Into crosstabFunc values (1, 1, 'A', 10); Insert Into crosstabFunc Values (2, 1, 'B', 20); Insert Into crosstabFunc Values (3, 1, 'C', 30); Insert Into crosstabFunc Values (4, 2, 'A', 40); Insert Into crosstabFunc Values (5, 2, 'C', 50); Insert Into crosstabFunc Values (6, 3, 'A', 60); Insert Into crosstabFunc Values (7, 3, 'B', 70); Insert Into crosstabFunc Values (8, 3, 'C', 80); Insert Into crosstabFunc Values (9, 3, 'D', 90); Insert Into crosstabFunc Values (10, 4, 'A', 100); Oracle Implement the pivoting functionality in Oracle with the following code. Select * From (Select customer_id, product_code, quantity From crosstabFunc) Pivot (Sum(quantity) As sum_quantity For (product_code) In ('A' AS a, 'B' AS b, 'C' AS c)) Order By customer_id; SQL Server Implement the pivoting functionality in SQL Server with the following code. Select * From (Select customer_id, product_code, quantity From crosstabFunc) as cf Pivot (Sum(quantity) For product_code In (A,B,C)) as cf1 Order By customer_id PostgreSQL Create the extension for PostgreSQL with the following code. Create Extension tablefunc; Select * From Crosstab (' Select customer_id, product_code, quantity From crosstabFunc' ) as T ( customer_id Int, "A" Int, "B" Int, "C" Int) Unpivoting to an array There is no Unpivot function available in PostgreSQL. When converting from SQL Server or Oracle to PostgreSQL, the unpivot is mapped to an array. See the following code for an example. Create Table Students ( Id Int Primary Key Identity, Student_Name Varchar (50), Math_marks Int, English_marks Int, History_marks Int, Science_marks Int ) Go Insert Into Students Values ('Sally', 87, 56, 78, 91 ) Insert Into Students Values ('Edward', 69, 80, 92, 98) Oracle Implement the unpivoting functionality in Oracle with the following sample code. Select StudentName, course,score From Students Unpivot (score For course In (Math_marks AS 'Maths', English_marks AS 'English', History_marks AS 'History', Science_marks As 'Science')); SQL Server Implement the unpivoting functionality in SQL Server with the following sample code. Select Student_Name, Course, Score From Students Unpivot ( Score For Course in (Math_marks, English_marks, History_marks, Science_marks) ) AS SchoolUnpivot PostgreSQL Implement the unpivoting functionality in PostgreSQL with the following sample code. 
Select Student_Name, course, score From ( Select Student_Name, Unnest (Array[ 'Math', 'English','History', 'Science'] ) As course, Unnest (Array[ Math_marks, English_marks,History_marks,Science_marks] ) As score From StudentsP ) AS Unpvt Returning multiple result sets from a function It is straightforward for SQL Server to return multiple result sets with multiple rows. You can accomplish the same in PostgreSQL and Oracle with cursors as given samples below. Oracle Return multiple result sets from a procedure in Oracle with the following code. Create Procedure Spgetdept23 (P_Cur Out Sys_Refcursor, P_Cur12 Out Sys_Refcursor) Is Begin Open P_Cur For Select * From employees; Open P_Cur12 For Select * From dept; End; var cur Refcursor var cur2 Refcursor Exec Spgetdept23(:cur,:cur2); Print cur; Print cur2; SQL Server Return multiple result sets from a procedure in SQL Server with the following code. No extra parameters are required in SQL Server. Create Procedure Dbo.Multiple_Reseultset As Begin Select * From HumanResources.Employee Select * From HumanResources.Department End To execute the procedure in SQL Server, enter the following code. Exec Dbo.Multiple_Reseultset To execute the procedure in SQL Server, enter the following code. Exec Dbo.Multiple_Reseultset PostgreSQL Return multiple result sets from a procedure in PostgreSQL with the following code. Create Or Replace Function Multiple_Reseultset() Returns Setof Refcursor As $$ Declare cur1 Refcursor; cur2 Refcursor; Begin Open cur1 For Select * From HumanResources.employee; Return Next cur1; Open cur2 For Select * From HumanResources. Department; Return Next cur2; End $$ Language 'plpgsql'; To execute the procedure in PostgreSQL, enter the following code. Begin Select * From Public.Multiple_Reseultset( ) Fetch All In "" Fetch All In "" End Inline queries with alias PostgreSQL semantics may refer to inline views as Subselect or Subquery. Oracle supports omitting aliases for the inner statement. In PostgreSQL and SQL Server, the use of aliases is mandatory. The following code examples use B as an alias. Oracle The following code is a sample inline query in Oracle. Select a.col1, col2_fromSubquery -- you can specify the columns directly from the subquery with out any prefix of subquery unless have common columns names. from emplyee a, (select * from salary ) where active=true SQL Server and PostgreSQL The same sample inline queries written in Oracle requires an alias name in SQL Server and PostgreSQL. Select a.col1, b.col2_fromSubquery from emplyee a, (select * from salary ) b where active=true Data order After migrating data from either Oracle or SQL Server to PostgreSQL, the retrieval order of the data may vary. The reason could be either the order of insertion or the data type of the column and its values or collation. To get the correct order of the data, identify the business need and apply the Order by clause on the query to match the data. dblink and foreign data wrappers dblink is the functionality used to communicate across homogeneous and heterogeneous databases. As of this post, Amazon RDS and Aurora PostgreSQL don’t offer heterogeneous support, but they do have support to communicate across the PostgreSQL databases. Communicating across homogeneous databases PostgreSQL support cross database communication with dblink and foreign data wrappers (FDWs) for cross-database communication. This section discusses how to use dblink and FDW. 
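Before configuring either approach, you can confirm that the postgres_fdw and dblink extensions are available on the target Amazon RDS or Aurora PostgreSQL instance with a quick catalog query; an empty installed_version simply means the extension has not yet been created in the current database.
Select name, default_version, installed_version
From pg_available_extensions
Where name In ('postgres_fdw', 'dblink');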
Using foreign data wrappers
PostgreSQL supports FDWs, which you can use to access data stored in external sources. Amazon RDS and Aurora PostgreSQL currently support only the PostgreSQL FDW. To configure the PostgreSQL FDW, complete the following steps.
Create the extension with the following code.
Create Extension postgres_fdw;
Create the server and link to the external database with the following code.
Create Server server_name1
  Foreign Data Wrapper postgres_fdw
  Options (host 'abcd.rds.amazonaws.com', dbname 'abcd', port '5432');
Create the user mapping to access the tables from the external database with the following code.
Create User Mapping For Current_User
  Server server_name1
  Options (user 'pgar1234', password 'pgar1234');
Create a user mapping for every user who needs to communicate via the FDW. Import the external tables into a local schema so that the data from the external tables can be accessed just like regular tables. Here is the sample code to import the tables from the external database and schema.
Create Schema imported_public2; -- created local schema
Import Foreign Schema public From Server server_name1 Into imported_public2; -- This will import all the tables
Select * From imported_public2.emptable;
Communicating across heterogeneous databases
PostgreSQL doesn't support heterogeneous cross-database communication natively. Amazon Aurora PostgreSQL has limitations here, but you can implement dblink on the source environment (for example, Oracle or SQL Server) to the target (PostgreSQL), and either pull or push the data. For more information, see Cross-Database Querying in Compose PostgreSQL.
Creating a view for a foreign database table with dblink
dblink is a PostgreSQL contrib extension that allows you to perform short ad hoc queries in other databases. With the dblink option, the user must provide and store the password in clear text, which is visible to users. This option is not recommended unless you have no other choice. For more information, see the Foreign Data Wrapper and postgres_fdw documentation.
Option 1: Provide target database access details in the SQL statement itself
In this option, the host and database credentials must be provided in every statement, so any change in the host or connection details has to be applied in multiple places.
Create Or Replace View emptable_dblink As
  Select emptable.empid, emptable.last_name, emptable.first_name
  From Dblink('host=abcd.rds.amazonaws.com user=abcd password=abcd dbname=abcd port=5432',
              'Select empid, last_name, first_name From emptable')
       AS emptable(empid Int, last_name Varchar, first_name Text);
Select * From emptable_dblink;
Option 2: Separate out access details and use a connection object
In this option, host and connection details are defined in one place, and the connection name is used for the cross-database queries.
Select Dblink_Connect('conName', 'dbname=abcd user=abcd password=abcd host=abcd.rds.amazonaws.com');
Create Or Replace View mytabview1 As
  Select mytable.*
  From Dblink('conName', 'Select empid, last_name, first_name From emptable')
       As mytable(empid Int, last_name Varchar, first_name Text);
Select * From mytabview1;
Function call with dblink
The following code calls a function in a foreign PostgreSQL database that returns an integer.
Select * From Dblink('host=abcd.rds.amazonaws.com user=abcd password=abcd dbname=postgres port=5432',
  'Select public.add_ten(10)') As add_ten(a Int);
The following code calls a function in a foreign PostgreSQL database that returns a table type.
Select Dblink_Connect('conName','dbname=pgar1234 user=pgar1234 password=pgar1234 host=pgar1234.ctegx79rcs0q.ap-south-1.rds.amazonaws.com'); Select Dblink_Open('conName','foo2', 'Select * From public.tabletypetest(10)'); Select * From Dblink_Fetch('conName','foo2', 5) As (empid Int, last_name Varchar); Finding the maximum and minimum value of a set of numbers You may need maximum and minimum values when migrating to PostgreSQL. PostgreSQL includes a function to find these values, as demonstrated with the following code. Select Greatest(1,2,3,50,100) -> 100 Select Least(1,2,3,50,100) -> 1 Considering self-join for updates Updates work differently in PostgreSQL compared to SQL Server if you are using the same source table (the table that is getting updated) in the from clause of select statement. In PostgreSQL, the second reference in the from clause is independent of first reference, unlike SQL Server, and the changes are applied to the entire table. The following code example updates salaries for employees from Department 1. Update employee Set salary = employee.salary + employee.salary * 0.10 From Employee e Join dept d on d.deptid = e.deptid Where d.deptid=1 This function works the same in SQL Server, but when you migrate, the same SQL statement updates the entire table rather than a single department. PostgreSQL works differently, it assumes that the two employee tables are independent from each other, unlike SQL Server. To update a single department, convert the DML to the following code. Update Employee e Set salary = e.salary + e.salary * 0.10 From dept d Where d.deptid = e.deptid And d.deptid=1 If using Oracle, convert the DML to the following code. Update Employee e Set Salary = e.salary + e.salary * 0.10 Where Exists (Select 1 from dept d where d.deptid = e.deptid And d.deptid=1 ) Summary This post shared some tips and best practices for developers working on migrations from commercial databases to PostgreSQL. This post highlights many decisions you must make during the migration and how they can impact your database performance. Keeping these performance aspects in mind during the conversion can help avoid performance issues later on during migration. If you have any questions or comments about this post, please share your thoughts in the comment section.     About the Author Viswanatha Shastry Medipalli is a Consultant with the AWS ProServe team in India. His background spans a wide depth and breadth of expertise and experience in SQL database migrations. He has architected and designed many successful database solutions addressing challenging business requirements. He has provided solutions using Oracle, SQL Server and PostgreSQL for reporting, business intelligence, applications, and development support. He also has a good knowledge of automation, and orchestration. His focus area is homogeneous and heterogeneous migrations of on-premise databases to Amazon RDS and Aurora PostgreSQL.     https://probdm.com/site/NzAxMg