The FDW callback functions GetForeignRelSize
,
GetForeignPaths
, GetForeignPlan
, and
PlanForeignModify
must fit into the workings of the
PostgreSQL™ planner. Here are some notes about what
they must do.
The information in root
and baserel
can be used
to reduce the amount of information that has to be fetched from the
foreign table (and therefore reduce the cost).
baserel->baserestrictinfo
is particularly interesting, as
it contains restriction quals (WHERE
clauses) that should be
used to filter the rows to be fetched. (The FDW itself is not required
to enforce these quals, as the core executor can check them instead.)
baserel->reltargetlist
can be used to determine which
columns need to be fetched; but note that it only lists columns that
have to be emitted by the ForeignScan plan node, not
columns that are used in qual evaluation but not output by the query.
Various private fields are available for the FDW planning functions to keep information in. Generally, whatever you store in FDW private fields should be palloc'd, so that it will be reclaimed at the end of planning.
baserel->fdw_private
is a void pointer that is
available for FDW planning functions to store information relevant to
the particular foreign table. The core planner does not touch it except
to initialize it to NULL when the baserel
node is created.
It is useful for passing information forward from
GetForeignRelSize
to GetForeignPaths
and/or
GetForeignPaths
to GetForeignPlan
, thereby
avoiding recalculation.
GetForeignPaths
can identify the meaning of different
access paths by storing private information in the
fdw_private
field of ForeignPath nodes.
fdw_private
is declared as a List pointer, but
could actually contain anything since the core planner does not touch
it. However, best practice is to use a representation that's dumpable
by nodeToString
, for use with debugging support available
in the backend.
GetForeignPlan
can examine the fdw_private
field of the selected ForeignPath node, and can generate
fdw_exprs
and fdw_private
lists to be
placed in the ForeignScan plan node, where they will be
available at execution time. Both of these lists must be
represented in a form that copyObject
knows how to copy.
The fdw_private
list has no other restrictions and is
not interpreted by the core backend in any way. The
fdw_exprs
list, if not NIL, is expected to contain
expression trees that are intended to be executed at run time. These
trees will undergo post-processing by the planner to make them fully
executable.
In GetForeignPlan
, generally the passed-in target list can
be copied into the plan node as-is. The passed scan_clauses
list
contains the same clauses as baserel->baserestrictinfo
,
but may be re-ordered for better execution efficiency. In simple cases
the FDW can just strip RestrictInfo nodes from the
scan_clauses
list (using extract_actual_clauses
) and put
all the clauses into the plan node's qual list, which means that all the
clauses will be checked by the executor at run time. More complex FDWs
may be able to check some of the clauses internally, in which case those
clauses can be removed from the plan node's qual list so that the
executor doesn't waste time rechecking them.
As an example, the FDW might identify some restriction clauses of the
form foreign_variable
=
sub_expression
, which it determines can be executed on
the remote server given the locally-evaluated value of the
sub_expression
. The actual identification of such a
clause should happen during GetForeignPaths
, since it would
affect the cost estimate for the path. The path's
fdw_private
field would probably include a pointer to
the identified clause's RestrictInfo node. Then
GetForeignPlan
would remove that clause from scan_clauses
,
but add the sub_expression
to fdw_exprs
to ensure that it gets massaged into executable form. It would probably
also put control information into the plan node's
fdw_private
field to tell the execution functions what
to do at run time. The query transmitted to the remote server would
involve something like WHERE
, with the parameter value obtained at run time from
evaluation of the foreign_variable
=
$1fdw_exprs
expression tree.
The FDW should always construct at least one path that depends only on
the table's restriction clauses. In join queries, it might also choose
to construct path(s) that depend on join clauses, for example
foreign_variable
=
local_variable
. Such clauses will not be found in
baserel->baserestrictinfo
but must be sought in the
relation's join lists. A path using such a clause is called a
“parameterized path”. It must identify the other relations
used in the selected join clause(s) with a suitable value of
param_info
; use get_baserel_parampathinfo
to compute that value. In GetForeignPlan
, the
local_variable
portion of the join clause would be added
to fdw_exprs
, and then at run time the case works the
same as for an ordinary restriction clause.
When planning an UPDATE or DELETE,
PlanForeignModify
can look up the RelOptInfo
struct for the foreign table and make use of the
baserel->fdw_private
data previously created by the
scan-planning functions. However, in INSERT the target
table is not scanned so there is no RelOptInfo for it.
The List returned by PlanForeignModify
has
the same restrictions as the fdw_private
list of a
ForeignScan plan node, that is it must contain only
structures that copyObject
knows how to copy.
For an UPDATE or DELETE against an external data
source that supports concurrent updates, it is recommended that the
ForeignScan
operation lock the rows that it fetches, perhaps
via the equivalent of SELECT FOR UPDATE. The FDW may also
choose to lock rows at fetch time when the foreign table is referenced
in a SELECT FOR UPDATE/SHARE; if it does not, the
FOR UPDATE
or FOR SHARE
option is essentially a
no-op so far as the foreign table is concerned. This behavior may yield
semantics slightly different from operations on local tables, where row
locking is customarily delayed as long as possible: remote rows may get
locked even though they subsequently fail locally-applied restriction or
join conditions. However, matching the local semantics exactly would
require an additional remote access for every row, and might be
impossible anyway depending on what locking semantics the external data
source provides.