@@ -14,6 +14,11 @@ easily optimize batch insertion, and also allows "noisy" data (values not in a
14 14 suitable format) to be filtered for review while other, correct, values are
15 15 inserted.
16 16
17+ In addition to Oracle Database "Array DML" batch loading,
18+ :ref:`directpathloads` can be used for very fast loading of large data sets if
19+ certain schema criteria can be met. Another option for frequent, small inserts
20+ is to load data using the Oracle Database :ref:`memoptimized`.
21+
17 22 Related topics include :ref:`tuning` and :ref:`dataframeformat`.
18 23
19 24 Batch Statement Execution
@@ -618,3 +623,155 @@ B19E-449D-9968-1121AF06D793>`__ between the databases and using
618 623 INSERT INTO SELECT or CREATE AS SELECT.
619 624
620 625 You can control the data transfer by changing your SELECT statement.
626+
627+ .. _directpathloads:
628+
629+ Direct Path Loads
630+ =================
631+
632+ Direct Path Loads allow data being inserted into Oracle Database to bypass
633+ code layers such as the database buffer cache. No INSERT statements are
634+ used. This can be very efficient for ingesting huge amounts of data but, as a
635+ consequence of the architecture, there are restrictions on when Direct Path
636+ Loads can be used. For more information, see Oracle Database documentation
637+ such as the SQL*Loader `Direct Path Loads
638+ <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=
639+ GUID-0D576DEF-7918-4DD2-A184-754D217C021F>`__ and the Oracle Call Interface
640+ `Direct Path Load Interface
641+ <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=
642+ GUID-596F5F9B-47A1-48DB-8702-FEED7BE038B9>`__.
643+
644+ For smaller data sets, the end-to-end insertion time with Direct Path Loads
645+ may not be faster than using :meth:`Cursor.executemany()`. However, the load
646+ on the database can still be lower.
647+
648+ .. note::
649+
650+     Direct Path Loads are only supported in python-oracledb Thin mode.
651+
652+ Direct Path Loading is performed by the :meth:`Connection.direct_path_load()`
653+ method. For example, if you have the table::
654+
655+     create table TestDirectPathLoad (
656+         id   number(9),
657+         name varchar2(20)
658+     );
659+
660+ Then you can load data into it using the code:
661+
662+ .. code-block:: python
663+
664+     SCHEMA_NAME = "HR"
665+     TABLE_NAME = "TESTDIRECTPATHLOAD"
666+     COLUMN_NAMES = ["ID", "NAME"]
667+     DATA = [
668+         (1, "A first row"),
669+         (2, "A second row"),
670+         (3, "A third row"),
671+     ]
672+
673+     connection.direct_path_load(
674+         schema_name=SCHEMA_NAME,
675+         table_name=TABLE_NAME,
676+         column_names=COLUMN_NAMES,
677+         data=DATA
678+     )
679+
680+ The records are always implicitly committed.
681+
682+ The ``data`` parameter can be a list of sequences, a :ref:`DataFrame
683+ <oracledataframeobj>` object, or a third-party DataFrame instance that supports
684+ the Apache Arrow PyCapsule Interface. See :ref:`dfppl`.
685+
686+ To load into VECTOR columns, pass an appropriate `Python array.array()
687+ <https://docs.python.org/3/library/array.html>`__ value, or a list of values.
688+ For example, if you have the table::
689+
690+     create table TestDirectPathLoad (
691+         id   number(9),
692+         name varchar2(20),
693+         v64  vector(3, float64)
694+     );
695+
696+ Then you can load data into it using the code:
697+
698+ .. code-block:: python
699+
700+     import array
701+
702+     SCHEMA_NAME = "HR"
703+     TABLE_NAME = "TESTDIRECTPATHLOAD"
704+     COLUMN_NAMES = ["ID", "NAME", "V64"]
705+     DATA = [
706+         (1, "A first row", array.array("d", [1, 2, 3])),
707+         (2, "A second row", [4, 5, 6]),
708+         (3, "A third row", array.array("d", [7, 8, 9])),
709+     ]
710+     connection.direct_path_load(
711+         schema_name=SCHEMA_NAME,
712+         table_name=TABLE_NAME,
713+         column_names=COLUMN_NAMES,
714+         data=DATA
715+     )
716+
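When vector values arrive as plain Python sequences from different sources,
converting them up front makes the element type explicit. The following is a
minimal sketch and not part of python-oracledb: the helper name
``to_float64_vector()`` and the sample rows are hypothetical, and it assumes
the target column is declared as ``vector(3, float64)`` as in the table above.

```python
import array


def to_float64_vector(values):
    """Return values as array.array("d"); "d" is the 64-bit float typecode."""
    # Hypothetical helper for illustration only.
    if isinstance(values, array.array) and values.typecode == "d":
        return values  # already in the wanted form, so reuse it
    return array.array("d", values)


# Sample rows whose vector values are a list and a tuple
rows = [
    (1, "A first row", [1, 2, 3]),
    (2, "A second row", (4.5, 5.5, 6.5)),
]

# Rebuild each row with its vector converted to array.array("d")
DATA = [(id_, name, to_float64_vector(vec)) for id_, name, vec in rows]
```

The resulting ``DATA`` list has the same shape as the one in the example above
and could be passed as the ``data`` parameter in the same way.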
717+ For more on vectors, see :ref:`vectors`.
718+
719+ Runnable Direct Path Load examples are in the `GitHub examples
720+ <https://github.com/oracle/python-oracledb/tree/main/samples>`__ directory.
721+
722+ **Notes on Direct Path Loads**
723+
724+ - Data is implicitly committed.
725+ - Data being inserted into CLOB or BLOB columns must be strings or bytes, not
726+   python-oracledb :ref:`LOB Objects <lobobj>`.
727+ - Insertion of python-oracledb :ref:`DbObjectType Objects <dbobjecttype>` is
728+   not supported.
729+
730+ Review Oracle Database documentation for database requirements and
731+ restrictions.
732+
733+ Batching of Direct Path Loads
734+ -----------------------------
735+
736+ If buffer, network, or database limits make it desirable to process smaller
737+ sets of records, you can either make repeated calls to
738+ :meth:`Connection.direct_path_load()` or you can use the ``batch_size``
739+ parameter. For example:
740+
741+ .. code-block:: python
742+
743+     SCHEMA_NAME = "HR"
744+     TABLE_NAME = "TESTDIRECTPATHLOAD"
745+     COLUMN_NAMES = ["ID", "NAME"]
746+     DATA = [
747+         (1, "A first row"),
748+         (2, "A second row"),
749+         # ... remaining rows elided ...
750+         (10_000_000, "Ten millionth row"),
751+     ]
752+
753+     connection.direct_path_load(
754+         schema_name=SCHEMA_NAME,
755+         table_name=TABLE_NAME,
756+         column_names=COLUMN_NAMES,
757+         data=DATA,
758+         batch_size=1_000_000
759+     )
760+
761+ This will send the data to the database in batches of 1,000,000 records until
762+ all 10,000,000 records have been inserted.
763+
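If you choose repeated calls instead of ``batch_size``, the data can be split
in application code. The following is a minimal sketch under stated
assumptions: ``iter_batches()`` and ``load_in_batches()`` are hypothetical
helper names, not python-oracledb APIs, and ``connection`` is assumed to be an
open Thin mode connection. Only the ``direct_path_load()`` keyword parameters
shown earlier are used.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_in_batches(connection, schema_name, table_name, column_names,
                    rows, batch_size):
    """Call direct_path_load() once per batch and return the number of calls."""
    # Hypothetical wrapper: "connection" is assumed to provide the
    # direct_path_load() method described in this section.
    calls = 0
    for batch in iter_batches(rows, batch_size):
        connection.direct_path_load(
            schema_name=schema_name,
            table_name=table_name,
            column_names=column_names,
            data=batch,
        )
        calls += 1
    return calls
```

Because loaded records are implicitly committed, batches loaded before a
failing call would likely remain in the table, so an application using this
approach may need its own restart logic.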
764+ .. _memoptimized:
765+
766+ Memoptimized Rowstore
767+ =====================
768+
769+ The Memoptimized Rowstore is another Oracle Database feature for data
770+ ingestion, particularly for frequent single-row inserts. It can also aid query
771+ performance. The feature is configured and controlled through database
772+ settings and specific SQL statements, so no specific python-oracledb
773+ requirement or API is needed to take advantage of it.
774+
775+ To use the Memoptimized Rowstore, see Oracle Database documentation `Enabling
776+ High Performance Data Streaming with the Memoptimized Rowstore
777+ <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-9752E93D-55A7-4584-B09B-9623B33B5CCF>`__.