Partial Partition-Wise Joins: Single-Level Partitioning

The simplest method to enable a partial partition-wise join is to partition sales by hash on cust_id. The number of partitions determines the maximum degree of parallelism, because the partition is the smallest granule of parallelism for partial partition-wise join operations.

The parallel execution of a partial partition-wise join, illustrated in the following figure, assumes that both the degree of parallelism and the number of partitions of sales are 16. The execution involves two sets of query servers: one set, labeled set 1, scans the customers table in parallel. The granule of parallelism for the scan operation is a range of blocks.

Rows from customers that are selected by the first set, in this case all rows, are redistributed to the second set of query servers by hashing cust_id. For example, all rows in customers that could have matching rows in partition P1 of sales are sent to query server 1 in the second set. Rows received by the second set of query servers are joined with the rows from the corresponding partitions in sales. Query server number 1 in the second set joins all customers rows that it receives with partition P1 of sales.

Figure 3-3 Partial Partition-Wise Join

Description of "Figure 3-3 Partial Partition-Wise Join"

The example below shows the execution plan for the partial partition-wise join between sales and customers.

-----------------------------------------------------------------------------------------------
| Id  | Operation                             | Name      | Pstart| Pstop |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |           |       |       |      |            |
|   1 |  PX COORDINATOR                       |           |       |       |      |            |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10002  |       |       | P->S | QC (RAND)  |
|*  3 |    FILTER                             |           |       |       | PCWC |            |
|   4 |     HASH GROUP BY                     |           |       |       | PCWP |            |
|   5 |      PX RECEIVE                       |           |       |       | PCWP |            |
|   6 |       PX SEND HASH                    | :TQ10001  |       |       | P->P | HASH       |
|   7 |        HASH GROUP BY                  |           |       |       | PCWP |            |
|*  8 |         HASH JOIN                     |           |       |       | PCWP |            |
|   9 |          PART JOIN FILTER CREATE      | :BF0000   |       |       | PCWP |            |
|  10 |           PX RECEIVE                  |           |       |       | PCWP |            |
|  11 |            PX SEND PARTITION (KEY)    | :TQ10000  |       |       | P->P | PART (KEY) |
|  12 |             PX BLOCK ITERATOR         |           |       |       | PCWC |            |
|  13 |              TABLE ACCESS FULL        | CUSTOMERS |       |       | PCWP |            |
|  14 |          PX PARTITION HASH JOIN-FILTER|           |:BF0000|:BF0000| PCWC |            |
|* 15 |           TABLE ACCESS FULL           | SALES     |:BF0000|:BF0000| PCWP |            |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(COUNT(SYS_OP_CSR(SYS_OP_MSR(COUNT(*)),0))>100)
   8 - access("S"."CUST_ID"="C"."CUST_ID")
  15 - filter("S"."TIME_ID"<=TO_DATE(' 1999-10-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND 
"S"."TIME_ID">=TO_DATE(' 1999-07-01
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

This query runs in parallel, as displayed in the plan, because there are PX row sources. One table is partitioned, which is the SALES table. You can determine this because the PX PARTITION HASH row source contains a nonpartitioned table CUSTOMERS that is distributed through PX SEND PARTITION to a different slave set that performs the join.

Note:

The Rows, Cost (%CPU), Time, and TQ columns were removed from the plan table output in this example.

Note:

This section is based on hash partitioning, but it also applies for range, list, and interval partial partition-wise joins.

Considerations for full partition-wise joins also apply to partial partition-wise joins:

The degree of parallelism does not need to equal the number of partitions. In the figure, the query executes with two sets of 16 query servers. In this case, Oracle assigns 1 partition to each query server of the second set. Again, the number of partitions should always be a multiple of the degree of parallelism.
In Oracle RAC environments on MPPs, each hash partition of sales should preferably have affinity to only one node to avoid remote I/Os. Also, spread partitions over all nodes to avoid bottlenecks and use all CPU resources available on the system. A node can host multiple partitions when there are more partitions than nodes.

See Also:

Oracle Real Application Clusters Administration and Deployment Guide for more information about data affinity