Database

CLASS DESCRIPTION
Dataset

A Container for managing workspace connections.

Relationship
RelationshipManager

Dataset

Dataset(
    conn: str | Path,
    *,
    parent: Dataset[Any] | None = None,
    _walk_method: Literal[
        "sync", "threaded", "raw"
    ] = "raw",
)

A Container for managing workspace connections.

A Dataset is initialized using arcpy.da.Walk and discovers all child datasets, tables, and feature classes. Discovered objects can be accessed directly by name (e.g. dataset['featureclass_name']) or through the property for their type (e.g. dataset.feature_classes['featureclass_name']). The second form has the advantage that you know whether you are getting a FeatureClass, Table, or Dataset object.

Usage
>>> dataset = Dataset('dataset/path')
>>> fc1 = dataset.feature_classes['fc1']
>>> fc2 = dataset.feature_classes['fc2']
>>> len(fc1)
243
>>> len(fc2)
778

>>> count(dataset['fc1'][where('LENGTH > 500')])
42
>>> sum(dataset['fc2']['TOTAL'])
3204903

The dataset container keeps common operations such as counting, filtering, and summing short and readable.

Datasets also implement __contains__, which lets you check membership from the root node, including objects that live inside child datasets:

Example
>>> 'fc1' in dataset
True
>>> 'fc6' in dataset
True
>>> list(dataset.feature_classes)
['fc1', 'fc2']
>>> list(dataset.datasets)
['ds1']
>>> list(dataset['ds1'].feature_classes)
['fc3', 'fc4', 'fc5', 'fc6']
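
The lookup behaviour shown above can be sketched without arcpy. This is a minimal illustration, not arcpie's implementation: `MiniDataset` is a hypothetical stand-in that stores plain strings instead of FeatureClass objects, and it assumes that membership checks recurse into child datasets, which is what `'fc6' in dataset` returning True suggests.

```python
class MiniDataset:
    """Hypothetical stand-in for a Dataset; values are plain strings, not arcpy objects."""

    def __init__(self, feature_classes: "dict[str, str]",
                 datasets: "dict[str, MiniDataset] | None" = None) -> None:
        self.feature_classes = feature_classes
        self.datasets = datasets or {}

    def __getitem__(self, name: str) -> "str | MiniDataset":
        # Name-based access searches every category, so the return type is a union
        if name in self.feature_classes:
            return self.feature_classes[name]
        if name in self.datasets:
            return self.datasets[name]
        raise KeyError(name)

    def __contains__(self, name: object) -> bool:
        # Membership is checked at the root first, then in each child dataset
        if name in self.feature_classes or name in self.datasets:
            return True
        return any(name in child for child in self.datasets.values())


ds1 = MiniDataset({'fc3': 'FC3', 'fc6': 'FC6'})
root = MiniDataset({'fc1': 'FC1', 'fc2': 'FC2'}, {'ds1': ds1})

print('fc6' in root)                  # found through the child dataset 'ds1'
print('fc6' in root.feature_classes)  # not part of the root's own mapping
```

The typed properties trade a little verbosity for a known return type, while `in` on the root answers "does this exist anywhere in the workspace".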
METHOD DESCRIPTION
create_feature_dataset

Create a FeatureDataset (cannot be done if the parent Dataset is already a FeatureDataset!)

create_featureclass

Create a new FeatureClass in the Dataset

create_table

Create a new Table in the Dataset

export_rules

Export all attribute rules from the dataset into feature subdirectories

export_schema

Export the workspace Schema for a GDB dataset

export_schema_module

Export the workspace to a python schema file that uses TypedDict and Annotated

from_schema

Create a GDB from a schema file (xlsx, json, xml) generated by export_schema

from_schema_module

Build a new GDB from an existing schema module generated with export_schema_module

import_rules

Import Attribute rules for the dataset from a directory

walk

Traverse the connection/path using arcpy.da.Walk and discover all dataset children

ATTRIBUTE DESCRIPTION
annotations

A mapping of annotation names to FeatureClass objects

TYPE: dict[str, FeatureClass]

datasets

A mapping of dataset names to child Dataset objects

TYPE: dict[str, Dataset[Any]]

feature_classes

A mapping of featureclass names to FeatureClass objects in the dataset root

TYPE: dict[str, FeatureClass[Any, Any]]

relationships

A Manager object for interacting with RelationshipClasses

TYPE: RelationshipManager

tables

A mapping of table names to Table objects in the dataset root

TYPE: dict[str, Table[Any]]

Source code in src/arcpie/database.py
def __init__(self, conn: str | Path, *, 
             parent: Dataset[Any] | None = None, 
             _walk_method: Literal['sync', 'threaded', 'raw'] = 'raw'
    ) -> None:
    self.conn = Path(conn)
    self.parent = parent

    self._datasets = {}
    self._feature_classes = {}
    self._tables = {}
    self._relationships = {}
    self._annotations = {}

    # Force root dataset to be a gdb, pointing to a folder can cause issues with Walk
    if self.parent is None and self.conn.suffix != '.gdb':
        raise ValueError('Root Dataset requires a valid gdb path!')

    # Traverse the dataset or its parent (all child datasets are subsets of their parent)
    self._walk_method = _walk_method
    self.walk(_method=self._walk_method) if self.parent is None else self._walk_parent()

annotations property

annotations: dict[str, FeatureClass]

A mapping of annotation names to FeatureClass objects

datasets property

datasets: dict[str, Dataset[Any]]

A mapping of dataset names to child Dataset objects

feature_classes property

feature_classes: dict[str, FeatureClass[Any, Any]]

A mapping of featureclass names to FeatureClass objects in the dataset root

relationships property

relationships: RelationshipManager

A Manager object for interacting with RelationshipClasses

tables property

tables: dict[str, Table[Any]]

A mapping of table names to Table objects in the dataset root

create_feature_dataset

create_feature_dataset(
    name: str,
    *,
    spatial_reference: SpatialReference | WKID = WGS84,
) -> Dataset

Create a FeatureDataset (cannot be done if the parent Dataset is already a FeatureDataset!)

PARAMETER DESCRIPTION

name

The name for the new feature dataset (must be unique in parent dataset!)

TYPE: str

spatial_reference

An optional spatial reference to use for the dataset (default WGS84/EPSG:4326)

TYPE: SpatialReference | WKID DEFAULT: WGS84

RETURNS DESCRIPTION
Dataset

A new Dataset object parented to this Dataset

Source code in src/arcpie/database.py
def create_feature_dataset(self, name: str, *, spatial_reference: SpatialReference | WKID = WGS84) -> Dataset:
    """Create a FeatureDataset (cannot be done if the parent Dataset is already a FeatureDataset!)

    Args:
        name: The name for the new feature dataset (must be unique in parent dataset!)
        spatial_reference: An optional spatial reference to use for the dataset (default `WGS84`/`EPSG:4326`)

    Returns:
        A new Dataset object parented to this Dataset
    """
    if isinstance(spatial_reference, int):
        spatial_reference = SpatialReference(spatial_reference)
    return Dataset(CreateFeatureDataset(str(self.conn), name, spatial_reference=spatial_reference)[0], parent=self)

create_featureclass

create_featureclass(
    name: str,
    geometry_type: GeoType | None,
    feature_dataset: str | None = None,
    *,
    template: FeatureClass | None = None,
    has_m: bool | None = None,
    has_z: bool | None = None,
    spatial_reference: SpatialReference | WKID = WGS84,
    config_keyword: str | None = None,
    alias: str | None = None,
    oid_type: Literal["64_BIT", "32_BIT"] | None = "64_BIT",
    _ensure_dataset: bool = True,
) -> FeatureClass

Create a new FeatureClass in the Dataset

Source code in src/arcpie/database.py
def create_featureclass(self, name: str, geometry_type: GeoType | None, feature_dataset: str | None = None,
                        *,
                        template: FeatureClass | None = None,
                        has_m: bool | None = None,
                        has_z: bool | None = None,
                        spatial_reference: SpatialReference | WKID = WGS84,
                        config_keyword: str | None = None,
                        alias: str | None = None,
                        oid_type: Literal['64_BIT', '32_BIT'] | None = '64_BIT',
                        _ensure_dataset: bool = True,
    ) -> FeatureClass:
    """Create a new FeatureClass in the Dataset"""
    if feature_dataset:
        path = str(self.conn / feature_dataset)
        # Since we're modifying the dataset, we need to re-index
        if _ensure_dataset:
            self.walk()
            if feature_dataset not in self.datasets:
                self.create_feature_dataset(feature_dataset, spatial_reference=spatial_reference)
    else:
        path = str(self.conn)
    return FeatureClass(
        CreateFeatureclass(
            out_path=path,
            out_name=name,
            geometry_type=geometry_type,
            template=str(template.path) if template else None,
            has_m='SAME_AS_TEMPLATE' if has_m is None and template is not None else ('ENABLED' if has_m else 'DISABLED'),
            has_z='SAME_AS_TEMPLATE' if has_z is None and template is not None else ('ENABLED' if has_z else 'DISABLED'),
            spatial_reference=spatial_reference,
            config_keyword=config_keyword,
            out_alias=alias,
            oid_type='SAME_AS_TEMPLATE' if oid_type is None and template is not None else oid_type,
        )[0]
    )

create_table

create_table(
    name: str,
    *,
    template: Table | None = None,
    config_keyword: str | None = None,
    alias: str | None = None,
    oid_type: Literal["64_BIT", "32_BIT"] | None = "64_BIT",
) -> Table

Create a new Table in the Dataset

PARAMETER DESCRIPTION

name

The name of the new table

TYPE: str

template

A Table object to use as a template

TYPE: Table | None DEFAULT: None

config_keyword

A keyword to pass to the database engine for table setup

TYPE: str | None DEFAULT: None

alias

An alias for the table

TYPE: str | None DEFAULT: None

oid_type

The size of OID to use. If template is set and this is None, template OID type is used

TYPE: Literal['64_BIT', '32_BIT'] | None DEFAULT: '64_BIT'

Source code in src/arcpie/database.py
def create_table(self, name: str,
                 *,
                 template: Table | None = None,
                 config_keyword: str | None = None,
                 alias: str | None = None,
                 oid_type: Literal["64_BIT", "32_BIT"] | None = '64_BIT',
    ) -> Table:
    """Create a new Table in the Dataset

    Args:
        name: The name of the new table
        template: A Table object to use as a template
        config_keyword: A keyword to pass to the database engine for table setup
        alias: An alias for the table
        oid_type: The size of OID to use. If template is set and this is `None`, template OID type is used
    """

    return Table(
        CreateTable(
            out_path=str(self.conn),
            out_name=name,
            template=str(template.path) if template else None,
            config_keyword=config_keyword,
            out_alias=alias,
            oid_type='SAME_AS_TEMPLATE' if template and oid_type is None else oid_type
        )[0]
    )

export_rules

export_rules(
    rule_dir: Path | str,
) -> Iterator[AttributeRule]

Export all attribute rules from the dataset into feature subdirectories

PARAMETER DESCRIPTION

rule_dir

The target directory for the rules

TYPE: Path | str

Usage
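>>> # Transfer rules from one dataset to another
>>> # Both methods are generators, so consume them to perform the work
>>> exported = list(ds.export_rules('my_rules'))
>>> imported = list(ds2.import_rules('my_rules'))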
Source code in src/arcpie/database.py
def export_rules(self, rule_dir: Path|str) -> Iterator[AttributeRule]:
    """Export all attribute rules from the dataset into feature subdirectories

    Args:
        rule_dir (Path|str): The target directory for the rules

    Usage:
        ```python
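        >>> # Transfer rules from one dataset to another
        >>> # Both methods are generators, so consume them to perform the work
        >>> exported = list(ds.export_rules('my_rules'))
        >>> imported = list(ds2.import_rules('my_rules'))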
        ```
    """
    for feature_class in self.feature_classes.values():
        yield from feature_class.attribute_rules.export_rules(Path(rule_dir))
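
Because the body uses `yield from`, calling `export_rules` only creates a generator; nothing is exported until it is iterated. The following sketch shows that behaviour with a hypothetical `export_names` function whose side effect (appending to a list) stands in for writing rule files.

```python
from collections.abc import Iterator

written: list[str] = []


def export_names(names: list[str]) -> Iterator[str]:
    """Hypothetical exporter: records each name as a side effect, then yields it."""
    for name in names:
        written.append(name)  # stands in for writing a rule file
        yield name


pending = export_names(['rule_a', 'rule_b'])
print(written)            # still empty, the generator has not started

exported = list(pending)  # consuming the generator performs the work
print(written)
```

Wrapping the call in `list(...)` or a `for` loop is enough to drive the export to completion.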

export_schema

export_schema(
    out_loc: Path | str,
    *,
    schema_name: str | None = None,
    out_format: Literal[
        "JSON", "XLSX", "HTML", "PDF", "XML"
    ] = "JSON",
    remove_rules: bool = False,
) -> Path

Export the workspace Schema for a GDB dataset

PARAMETER DESCRIPTION

out_loc

The output location for the workspace schema

TYPE: Path | str

schema_name

A name for the schema (default: Dataset.name)

TYPE: str | None DEFAULT: None

out_format

The output format (default: 'JSON')

TYPE: Literal['JSON', 'XLSX', 'HTML', 'PDF', 'XML'] DEFAULT: 'JSON'

remove_rules

Don't export associated attribute rules for the dataset (default: False)

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
Path

The Path object pointing to the output file

TYPE: Path

Source code in src/arcpie/database.py
def export_schema(self, out_loc: Path|str,
                  *,
                  schema_name: str|None=None, 
                  out_format: Literal['JSON', 'XLSX', 'HTML', 'PDF', 'XML']='JSON',
                  remove_rules: bool=False) -> Path:
    """Export the workspace Schema for a GDB dataset

    Args:
        out_loc (Path|str): The output location for the workspace schema
        schema_name (str|None): A name for the schema (default: Dataset.name)
        out_format (Literal['JSON', 'XLSX', 'HTML', 'PDF', 'XML']): The output format (default: 'JSON')
        remove_rules (bool): Don't export associated attribute rules for the dataset (default: False)

    Returns:
        Path : The Path object pointing to the output file
    """
    out_loc = Path(out_loc)
    out_loc.mkdir(exist_ok=True, parents=True)
    name = schema_name or self.name
    outfile = (out_loc / name).with_suffix(f'.{out_format.lower()}')
    workspace = json.load(convert_schema(self, out_format))
    schema = patch_schema_rules(workspace, remove_rules=remove_rules)
    with outfile.open('w') as f:
        json.dump(schema, f, indent=2)
    return outfile

export_schema_module

export_schema_module(
    out_loc: Path | str,
    *,
    tables: bool | Sequence[str] = True,
    featureclasses: bool | Sequence[str] = True,
    datasets: bool | Sequence[str] = True,
    mod_doc: str | None = None,
    fallback_type: type = object,
    docs: dict[str, dict[str, str]] | None = None,
    include_shape_token: bool = True,
    include_oid_token: bool = True,
    default_doc: Callable[[Field], str]
    | None
    | Literal["nodoc"] = None,
    skip_annotations: bool = False,
) -> None

Export the workspace to a python schema file that uses TypedDict and Annotated to store field definitions. This is similar to Pydantic models, but these can be ingested by Table and FeatureClass objects to type their iterators

PARAMETER DESCRIPTION

tables

Include table schemas in output (Only specified names if a sequence is provided)

TYPE: bool | Sequence[str] DEFAULT: True

featureclasses

Include featureclasses in output (Only specified names if a sequence is provided)

TYPE: bool | Sequence[str] DEFAULT: True

datasets

Include schemas for datasets in the output (Only specified names if a sequence is provided)

TYPE: bool | Sequence[str] DEFAULT: True

out_loc

The filepath of the output module (e.g. <root>/schemas/db_schema.py)

TYPE: Path | str

mod_doc

Optional module documentation to include at the top of the file (default: {self.name} Schema)

TYPE: str | None DEFAULT: None

fallback_type

Default type for any fieldtype that can't be mapped to a Python type

TYPE: type DEFAULT: object

docs

Optional docs for each feature class in the format {'Feature': {'Field': 'Field Doc', ...}, ...}

TYPE: dict[str, dict[str, str]] | None DEFAULT: None

include_shape_token

Include @SHAPE in output schema (will inherit from FC shape)

TYPE: bool DEFAULT: True

include_oid_token

Include the @OID token in the output schema

TYPE: bool DEFAULT: True

default_doc

Optional default docstring func for fields ('nodoc' will exclude docstring from output)

TYPE: Callable[[Field], str] | None | Literal['nodoc'] DEFAULT: None

skip_annotations

Don't export schema for Annotation Features

TYPE: bool DEFAULT: False

Note: If the supplied out_loc does not end in .py, it is treated as a directory and a file named {self.name}.py is written inside it. Intermediate folders will be created if they do not exist.

Source code in src/arcpie/database.py
def export_schema_module(self, out_loc: Path|str, 
                       *,
                       tables: bool | Sequence[str] = True,
                       featureclasses: bool | Sequence[str] = True,
                       datasets: bool | Sequence[str] = True,
                       mod_doc: str | None = None,
                       fallback_type: type = object,
                       docs: dict[str, dict[str, str]] | None = None,
                       include_shape_token: bool = True,
                       include_oid_token: bool = True,
                       default_doc: Callable[[Field], str] | None | Literal['nodoc'] = None,
                       skip_annotations: bool = False,
    ) -> None:
    """Export the workspace to a python schema file that uses TypedDict and Annotated 
    to store field definitions. This is similar to Pydantic models, but these can be ingested by
    Table and FeatureClass objects to type their iterators

    Args:
        tables: Include table schemas in output (Only specified names if a sequence is provided)
        featureclasses: Include featureclasses in output (Only specified names if a sequence is provided)
        datasets: Include schemas for datasets in the output (Only specified names if a sequence is provided)
        out_loc: The filepath of the output module (e.g. `<root>/schemas/db_schema.py`)
        mod_doc: Optional module documentation to include at the top of the file (default: `{self.name} Schema`)
        fallback_type: Default type for any fieldtype that can't be mapped to a Python type
        docs: Optional docs for each feature class in the format `{'Feature': {'Field': 'Field Doc', ...}, ...}`
        include_shape_token: Include @SHAPE in output schema (will inherit from FC shape)
        include_oid_token: Include the @OID token in the output schema
        default_doc: Optional default docstring func for fields (`'nodoc'` will exclude docstring from output)
        skip_annotations: Don't export schema for Annotation Features
    Note:
        If the supplied out_loc does not end in `.py`, it is treated as a directory and a file named 
        `{self.name}.py` is written inside it. Intermediate folders will be created if 
        they do not exist. 
    """
    from .schema.field import SCHEMA_IMPORTS
    if mod_doc:
        mod_doc = SCHEMA_IMPORTS.format(mod_doc)
    else:
        mod_doc = SCHEMA_IMPORTS.format(f'{self.name} Schema')

    out_loc = Path(out_loc)
    if out_loc.suffix != '.py':
        out_loc = out_loc / f'{self.name}.py'
    out_loc.parent.mkdir(exist_ok=True, parents=True)

    _items: list[FeatureClass | Table] = []

    # Gather all requested FeatureClasses and Tables
    _features = []
    if featureclasses:
        # Skip annotations since they have additional interfaces that aren't modeled
        if isinstance(featureclasses, Sequence):
            _features = [
                fc 
                for fc in self.feature_classes.values()
                if fc.name in featureclasses
            ]
        else:
            _features = list(self.feature_classes.values())
        _items.extend(_features)

    _tables = []
    if tables:
        if isinstance(tables, Sequence):
            _tables = [
                tbl
                for tbl in self.tables.values()
                if tbl.name in tables
            ]
        else:
            _tables = list(self.tables.values())
        _items.extend(_tables)

    _datasets = []
    if datasets:
        if isinstance(datasets, Sequence):
            _datasets = [
                ds 
                for ds in self.datasets.values() 
                if ds.name in datasets
            ]
        else:
            _datasets = list(self.datasets.values())
    with out_loc.open('wt') as fl:
        fl.write(mod_doc)

        # Notate root for later parsing operations
        fl.write("# Entry Point for parser\n")
        fl.write(f'SCHEMA_ROOT = "{self.name}"\n\n')
        for item in _items:
            # Extract any supplied FC docs
            doc = docs.get(item.name) if docs else None
            fl.write(
                item.get_schema(
                    fallback_type=fallback_type, 
                    docs=doc, 
                    include_shape_token=include_shape_token, 
                    include_oid_token=include_oid_token,
                    default_doc=default_doc,
                )
            )
            fl.write('\n\n')

        if _datasets:
            fl.write('# Dataset Definitions\n\n')

        ds_items: set[str] = set()
        for ds in _datasets:
            _ds_children = list(filter(lambda i: i.name in ds, _items))
            fl.write(f"class {ds.name}(TypedDict):\n")
            fl.write('    """FeatureDataset"""\n\n')
            for item in _ds_children:
                fl.write(f"    {item.name}: {item.name}\n")
                ds_items.add(item.name)
            fl.write('\n\n')

        fl.write("# Root Schema\n\n")

        fl.write(f"class {self.name}(TypedDict):\n")
        fl.write('    """Dataset"""\n\n')
        for item in filter(lambda i: i.name not in ds_items, _items):
            fl.write(f"    {item.name}: {item.name}\n")
        for ds in _datasets:
            fl.write(f"    {ds.name}: {ds.name}\n")
        fl.write('\n')

from_schema classmethod

from_schema(
    schema: Path | str,
    out_loc: Path | str,
    gdb_name: str,
    *,
    remove_rules: bool = False,
) -> Dataset[Any]

Create a GDB from a schema file (xlsx, json, xml) generated by export_schema

PARAMETER DESCRIPTION

schema

Path to the schema file

TYPE: Path | str

out_loc

Path to the GDB output directory

TYPE: Path | str

gdb_name

The name of the gdb

TYPE: str

remove_rules

Don't import Attribute Rules after building the new dataset (default: False)

TYPE: bool DEFAULT: False

Usage
>>> ds = Dataset.from_schema('schema.xlsx', 'out_dir', 'new_db.gdb', remove_rules=True)
... # This can take a while depending on the size of the schema
>>> ds
Dataset('new_db' {'Features': 10, 'Tables': 3, Datasets: 0})
Source code in src/arcpie/database.py
@classmethod
def from_schema(cls, schema: Path|str, out_loc: Path|str, gdb_name: str, 
                *,
                remove_rules: bool=False) -> Dataset[Any]:
    """Create a GDB from a schema file (xlsx, json, xml) generated by export_schema

    Args:
        schema (Path|str): Path to the schema file
        out_loc (Path|str): Path to the GDB output directory
        gdb_name (str): The name of the gdb
        remove_rules (bool): Don't import Attribute Rules after building the new dataset (default: False)

    Usage:
        ```python
        >>> ds = Dataset.from_schema('schema.xlsx', 'out_dir', 'new_db.gdb', remove_rules=True)
        ... # This can take a while depending on the size of the schema
        >>> ds
        Dataset('new_db' {'Features': 10, 'Tables': 3, Datasets: 0})
        ```
    """
    schema = Path(schema)
    out_loc = Path(out_loc)
    new_database = (out_loc / gdb_name).with_suffix('.gdb')

    # Convert the schema to json for easy parsing of attribute rules
    with TemporaryDirectory(f'{gdb_name}_json_schema') as temp:
        temp = Path(temp)
        # Convert the schema to json
        if not schema.suffix == '.json':
            converted_report, = ConvertSchemaReport(
                str(schema), str(temp), 'json_schema', 'JSON'
            )
        else:
            converted_report = str(schema)
        # Patch the schema doc
        workspace = patch_schema_rules(
            converted_report, remove_rules=remove_rules
        )
        # Write out to tempfile
        patched_schema = temp / 'patched_schema.json'
        patched_schema.write_text(json.dumps(workspace), encoding='utf-8')
        # Convert to importable XML
        xml_schema, = ConvertSchemaReport(
            str(patched_schema), str(temp), 'xml_schema', 'XML'
        )
        # Create a new GDB
        CreateFileGDB(str(out_loc), gdb_name, 'CURRENT')
        # Import the schema doc
        ImportXMLWorkspaceDocument(
            str(new_database), xml_schema, 'SCHEMA_ONLY'
        )
    return Dataset(new_database)

from_schema_module classmethod

from_schema_module(
    out_loc: Path | str,
    schema_module: ModuleType,
    *,
    spatial_reference: SpatialReference | WKID = WGS84,
    overwrite: bool = False,
    domain_module: ModuleType | None = None,
) -> Generator[FeatureClass | Table, None, Dataset[Any]]

Build a new GDB from an existing schema module generated with export_schema_module

PARAMETER DESCRIPTION

out_loc

Destination for the generated GDB

TYPE: Path | str

schema_module

The module containing all table definitions

TYPE: ModuleType

spatial_reference

The Spatial Reference to generate the database in (default: WGS84/EPSG:4326)

TYPE: SpatialReference | WKID DEFAULT: WGS84

overwrite

If the target database exists, overwrite it

TYPE: bool DEFAULT: False

domain_module

An optional domain module that will be used to create domains in the new dataset

TYPE: ModuleType | None DEFAULT: None

YIELDS DESCRIPTION
FeatureClass | Table

FeatureClasses/Tables as they are created for monitoring purposes

RETURNS DESCRIPTION
Dataset[Any]

A new Dataset object built from the schema

RAISES DESCRIPTION
FileExistsError

When the target directory exists and overwrite is set to False

Example

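>>> import my_database_schema
>>> builder = Dataset.from_schema_module('new_database.gdb', my_database_schema, spatial_reference=3857)
>>> created = list(builder)  # the GDB is built as the generator is consumed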
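
The signature is a generator that yields each object and then returns the finished Dataset, so a plain `for` loop or `list(...)` discards the return value. One way to keep it is to read `StopIteration.value`. The sketch below uses a hypothetical `build` generator in place of `from_schema_module` so that it runs without arcpy; `drain` is likewise an illustrative helper, not part of arcpie.

```python
from collections.abc import Generator


def build(names: list[str]) -> Generator[str, None, dict[str, int]]:
    """Hypothetical builder: yields each item as it is created, returns a summary."""
    created: dict[str, int] = {}
    for index, name in enumerate(names):
        created[name] = index
        yield name
    return created


def drain(gen):
    """Consume a generator, report progress, and hand back its return value."""
    while True:
        try:
            item = next(gen)
        except StopIteration as stop:
            # The generator's return value travels on the StopIteration exception
            return stop.value
        print('created', item)


result = drain(build(['roads', 'parcels']))
print(result)
```

Inside another generator, `result = yield from build(...)` achieves the same thing while passing the yielded items through.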

Source code in src/arcpie/database.py
@classmethod
def from_schema_module(cls, out_loc: Path | str, schema_module: ModuleType, 
                       *,
                       spatial_reference: SpatialReference | WKID = WGS84,
                       overwrite: bool = False,
                       domain_module: ModuleType | None = None
                       ) -> Generator[FeatureClass | Table, None, Dataset[Any]]:
    """Build a new GDB from an existing schema module generated with `export_schema_module`

    Args:
        out_loc: Destination for the generated GDB
        schema_module: The module containing all table definitions
        spatial_reference: The Spatial Reference to generate the database in (default: `WGS84`/`EPSG:4326`)
        overwrite: If the target database exists, overwrite it
        domain_module: An optional domain module that will be used to create domains in the new dataset

    Yields:
        FeatureClasses/Tables as they are created for monitoring purposes

    Returns:
        A new Dataset object built from the schema

    Raises:
        FileExistsError: When the target directory exists and `overwrite` is set to `False`

    Example:
        >>> import my_database_schema
        >>> builder = Dataset.from_schema_module('new_database.gdb', my_database_schema, spatial_reference=3857)
        >>> created = list(builder)  # the GDB is built as the generator is consumed
    """
    # Defer imports
    from .schema.field import parse_hierarchy
    from typing import is_typeddict
    schema_root = getattr(schema_module, 'SCHEMA_ROOT', None)
    if not schema_root:
        raise ValueError(
            f'A SCHEMA_ROOT global must be declared in the schema module '
            '(this is the name of last item generated by the export)'
        )

    root_dict: type | None = getattr(schema_module, schema_root, None)
    if not root_dict or not is_typeddict(root_dict):
        raise ValueError('SCHEMA_ROOT must be a TypedDict')
    if not (root_dict.__doc__ or '').startswith('Dataset'):
        raise ValueError('SCHEMA_ROOT must have a docstring with `Dataset` as the first line')


    out_loc = Path(out_loc)
    if out_loc.suffix != '.gdb':
        # Enforce '.gdb' suffix, and allow other suffixes:
        # e.g. ../my_database.new -> ../my_database.new.gdb
        out_loc = out_loc.with_suffix(out_loc.suffix + '.gdb')
    if out_loc.exists():
        if not overwrite:
            raise FileExistsError(
                f'{out_loc} Exists! '
                'To overwrite it, set the `overwrite` flag to True'
            )
        else:
            # Import rmtree to simplify gdb directory removal
            from shutil import rmtree
            rmtree(out_loc)

    # Create and bind the GDB
    CreateFileGDB(str(out_loc.parent), out_loc.name, 'CURRENT')
    ds = cls(out_loc)

    if domain_module:
        ds.domains.import_domains(getattr(domain_module, 'DOMAINS'), overwrite=overwrite)

    hierarchy = parse_hierarchy(root_dict, skip_annos=True)
    for child_name, child_def in hierarchy.items():
        child_def: tuple[GeoType | None, dict[str, Field]] | dict[str, Any]

        # Build FeatureClasses/Tables
        if isinstance(child_def, tuple):
            shape_type, fields = child_def
            if shape_type is None:
                table = ds.create_table(child_name)
                for field_name, field_props in fields.items():
                    if field_name.lower() in SYSTEM_FIELDS:
                        continue
                    if field_name == 'GlobalID':
                        table.add_globalids()
                        continue
                    try:
                        table.add_field(field_name, **field_props)
                    except Exception as e:
                        print(f'{field_name}: ', e)
                yield table
            else:
                fc = ds.create_featureclass(child_name, geometry_type=shape_type, spatial_reference=spatial_reference)
                for field_name, field_props in fields.items():
                    if field_name.lower() in SYSTEM_FIELDS:
                        continue
                    if field_name == 'GlobalID':
                        fc.add_globalids()
                        continue
                    try:
                        fc.add_field(field_name, **field_props)
                    except Exception as e:
                        print(f'{field_name}: ', e)
                yield fc

        # Parse FeatureDataset
        if isinstance(child_def, dict):
            ds_name = child_name
            ds.create_feature_dataset(ds_name, spatial_reference=spatial_reference)
            for fc_name, fc_def in child_def.items():
                fc_def: tuple[GeoType | None, dict[str, Field]]
                shape_type, fields = fc_def
                fc = ds.create_featureclass(fc_name, shape_type, ds_name, _ensure_dataset=False)
                for field_name, field_props in fields.items():
                    if field_name.lower() in SYSTEM_FIELDS:
                        continue
                    if field_name == 'GlobalID':
                        fc.add_globalids()
                        continue
                    try:
                        fc.add_field(field_name, **field_props)
                    except Exception as e:
                        print(f'{field_name}: ', e)
                yield fc
    return ds

import_rules

import_rules(
    rule_dir: Path | str, *, skip_fail: Literal[True]
) -> Iterator[AttributeRule | Exception]
import_rules(
    rule_dir: Path | str, *, skip_fail: Literal[False]
) -> Iterator[AttributeRule]
import_rules(
    rule_dir: Path | str,
    *,
    skip_fail: Literal[False] = False,
) -> Iterator[AttributeRule]
import_rules(
    rule_dir: Path | str, *, skip_fail: bool = False
) -> Iterator[AttributeRule | Exception]

Import Attribute rules for the dataset from a directory

PARAMETER DESCRIPTION

rule_dir

A directory containing rules in per-feature-class subdirectories

TYPE: Path | str

skip_fail

If True, skip any feature class whose rule import fails (the whole feature class is skipped) and yield the exception instead of raising it (default: False)

TYPE: bool DEFAULT: False

Usage
>>> # Transfer rules from one dataset to another
>>> ds.export_rules('my_rules')
>>> # import_rules is a generator, so consume it to run the import
>>> imported = list(ds2.import_rules('my_rules'))
Source code in src/arcpie/database.py
def import_rules(self, rule_dir: Path | str,
                 *,
                 skip_fail: bool = False) -> Iterator[AttributeRule | Exception]:
    """Import Attribute rules for the dataset from a directory

    Args:
        rule_dir (Path|str): A directory containing rules in feature subdirectories
        skip_fail (bool): Skip any attribute rule imports that fail (whole FC) (default: False)

    Usage:
        ```python
        >>> # Transfer rules from one dataset to another
        >>> ds.export_rules('my_rules')
        >>> # import_rules is a generator, so consume it to run the import
        >>> imported = list(ds2.import_rules('my_rules'))
        ```
    """
    rule_dir = Path(rule_dir)
    for feature_class in self.feature_classes.values():
        fc_rule_dir = rule_dir / feature_class.name
        # Skip feature classes that have no exported rule directory
        if not fc_rule_dir.exists():
            continue
        try:
            yield from feature_class.attribute_rules.import_rules(fc_rule_dir)
        except Exception as e:
            if not skip_fail:
                raise
            # __notes__ only exists if add_note() was called on the exception
            notes = getattr(e, '__notes__', [])
            print(f'Failed to import rules for {feature_class.name}: \n\t{notes}\n\t{e}')
            yield e
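
Because the body uses `yield from`, `import_rules` is a generator: nothing is imported until the returned iterator is consumed. With `skip_fail=True` the iterator can mix imported rules and captured exceptions, so callers will usually want to separate the two. The sketch below shows one way to do that. `fake_import_rules` is a hypothetical stand-in for `ds.import_rules(rule_dir, skip_fail=True)` so the example runs without arcpy, and the rule names are invented.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator


def fake_import_rules() -> Iterator[str | Exception]:
    """Hypothetical stand-in for ds.import_rules(rule_dir, skip_fail=True)."""
    yield "rule_a"                          # a successfully imported rule
    yield ValueError("fc2: invalid rule")   # a failed feature class, yielded rather than raised
    yield "rule_b"


def partition(results: Iterable[object]) -> tuple[list[object], list[Exception]]:
    """Split yielded items into imported rules and captured exceptions."""
    imported: list[object] = []
    failed: list[Exception] = []
    for item in results:
        if isinstance(item, Exception):
            failed.append(item)
        else:
            imported.append(item)
    return imported, failed


imported, failed = partition(fake_import_rules())
print(f"{len(imported)} imported, {len(failed)} failed")
```

With the real method, the same `partition` helper could be applied to `ds2.import_rules('my_rules', skip_fail=True)`.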

walk

walk(
    *, _method: Literal["sync", "threaded", "raw"] = "raw"
) -> None

Traverse the connection/path using arcpy.da.Walk and discover all dataset children

Note

This is called on dataset initialization and can take some time. Larger datasets can take a second or more to initialize.

Note

If the contents of a dataset change during its lifetime, you may need to call walk again. All children that are already initialized will be skipped and only new children will be initialized

Source code in src/arcpie/database.py
def walk(self, *, _method: Literal['sync', 'threaded', 'raw'] = 'raw') -> None:
    """Traverse the connection/path using `arcpy.da.Walk` and discover all dataset children

    Note:
        This is called on dataset initialization and can take some time. Larger datasets can take
        a second or more to initialize.

    Note:
        If the contents of a dataset change during its lifetime, you may need to call walk again. All 
        children that are already initialized will be skipped and only new children will be initialized
    """
    features = ['FeatureClass', 'Table', 'RelationshipClass', 'FeatureDataset']

    if _method == 'sync':
        datatypes = _extract_types_sync(self.conn, features)
    elif _method == 'threaded':
        datatypes = _extract_types_threaded(self.conn, features)
    elif _method == 'raw':
        datatypes = _extract_types_a00000004(self.conn, features)
    else:
        raise ValueError(f"Invalid walk method '{_method}', must be one of ['sync', 'threaded', 'raw']")

    self._feature_classes = {path.name: FeatureClass(path) for path in datatypes['FeatureClass']}
    self._tables = {path.name: Table(path) for path in datatypes['Table']}
    self._relationships = {path.name: Relationship(self.parent or self, path) for path in datatypes['RelationshipClass']}
    self._datasets = {path.name: Dataset(path, parent=self) for path in datatypes['FeatureDataset']}        

    # Special case for raw walk since we extracted annotations directly
    if _method == 'raw':
        if 'Annotation' in datatypes:
            self._annotations = {path.name: FeatureClass(path) for path in datatypes['Annotation']}
            self._feature_classes.update(self._annotations)
    else:
        # Annotations are a subtype of FeatureClass, so we need to access them differently
        with EnvManager(workspace=str(self.conn)):
            self._annotations = {
                anno: FeatureClass(self.conn / ds / anno)
                for ds in [''] + list(self._datasets) # include root
                for anno in ListFeatureClasses(feature_type='Annotation', feature_dataset=ds)
            }

    try:
        del self.schema
    except AttributeError:
        pass
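
The `del self.schema` at the end of `walk` reads like cache invalidation. Assuming `schema` is a `functools.cached_property` (the source shown here does not confirm this), the pattern would behave roughly as sketched below. `Workspace`, `children`, and `refresh` are hypothetical names used only for illustration.

```python
from functools import cached_property


class Workspace:
    """Hypothetical container; not the arcpie Dataset class."""

    def __init__(self) -> None:
        self.children: list[str] = ["fc1"]

    @cached_property
    def schema(self) -> tuple[str, ...]:
        # Computed on first access, then stored in the instance __dict__
        return tuple(self.children)

    def refresh(self) -> None:
        # Drop the cached value so the next access recomputes it
        try:
            del self.schema
        except AttributeError:
            pass  # nothing was cached yet


ws = Workspace()
ws.refresh()                 # safe even before the first access
first = ws.schema
ws.children.append("fc2")
stale = ws.schema            # still the cached value
ws.refresh()
fresh = ws.schema            # recomputed after invalidation
```

The `try`/`except AttributeError` guard matters because deleting a cached property that has never been accessed raises `AttributeError`.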

Relationship

Relationship(parent: Dataset[Any], path: Path | str)
METHOD DESCRIPTION
delete

Delete the relationship

update

Update the relationship class

ATTRIBUTE DESCRIPTION
destination_keys

Mapping of destination Primary and Foreign keys

TYPE: dict[Literal['DestinationPrimary', 'DestinationForeign'], str]

destinations

Destination FeatureClass/Table objects

TYPE: list[FeatureClass | Table[Any]]

origin_keys

Mapping of origin Primary and Foreign keys

TYPE: dict[Literal['OriginPrimary', 'OriginForeign'], str]

origins

Origin FeatureClass/Table objects

TYPE: list[FeatureClass | Table[Any]]

Source code in src/arcpie/database.py
def __init__(self, parent: Dataset[Any], path: Path|str) -> None:
    self.parent = parent
    self.path = path

destination_keys property

destination_keys: dict[
    Literal["DestinationPrimary", "DestinationForeign"], str
]

Mapping of destination Primary and Foreign keys

destinations property

destinations: list[FeatureClass | Table[Any]]

Destination FeatureClass/Table objects

origin_keys property

origin_keys: dict[
    Literal["OriginPrimary", "OriginForeign"], str
]

Mapping of origin Primary and Foreign keys

origins property

origins: list[FeatureClass | Table[Any]]

Origin FeatureClass/Table objects

delete

delete() -> None

Delete the relationship

Source code in src/arcpie/database.py
def delete(self) -> None:
    """Delete the relationship"""
    Delete(str(self.path), 'RelationshipClass')

update

update(**options: Unpack[RelationshipOpts]) -> None

Update the relationship class

Source code in src/arcpie/database.py
def update(self, **options: Unpack[RelationshipOpts]) -> None:
    """Update the relationship class"""
    rel_opts = self.settings
    self.delete()
    rel_opts.update(options)
    CreateRelationshipClass(**rel_opts)

RelationshipManager

RelationshipManager(parent: Dataset[Any])
METHOD DESCRIPTION
create

Create a relationship

delete

Delete the relationship and return the settings so it can be made again

Source code in src/arcpie/database.py
def __init__(self, parent: Dataset[Any]) -> None:
    self.parent = parent

create

create(**options: Unpack[RelationshipOpts]) -> None

Create a relationship

Source code in src/arcpie/database.py
def create(self, **options: Unpack[RelationshipOpts]) -> None:
    """Create a relationship"""
    CreateRelationshipClass(**options)

delete

delete(name: str) -> RelationshipOpts | None

Delete the relationship and return the settings so it can be made again

Source code in src/arcpie/database.py
def delete(self, name: str) -> RelationshipOpts | None:
    """Delete the relationship and return the settings so it can be made again"""
    rel = self.get(name)
    if rel is None:
        return None
    settings = rel.settings
    rel.delete()
    return settings
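
Since `delete` returns the removed settings, a relationship can be dropped, adjusted, and recreated in a few lines. The sketch below illustrates that round trip with an in-memory stand-in so it runs without arcpy. `FakeRelationshipManager` and the `name` and `cardinality` option keys are assumptions for illustration; the actual keys are whatever `RelationshipOpts` defines.

```python
from __future__ import annotations

from typing import Any


class FakeRelationshipManager:
    """Hypothetical in-memory stand-in for RelationshipManager."""

    def __init__(self) -> None:
        self._settings: dict[str, dict[str, Any]] = {}

    def create(self, **options: Any) -> None:
        # Keying on 'name' is an assumption made for this sketch
        self._settings[options["name"]] = dict(options)

    def delete(self, name: str) -> dict[str, Any] | None:
        # Return the removed settings, or None when the name is unknown
        return self._settings.pop(name, None)


manager = FakeRelationshipManager()
manager.create(name="parcel_owner", cardinality="ONE_TO_ONE")

settings = manager.delete("parcel_owner")
if settings is not None:
    settings.update(cardinality="ONE_TO_MANY")  # override one option
    manager.create(**settings)                  # recreate with the merged settings
```

Checking for `None` before recreating mirrors the documented return type, `RelationshipOpts | None`.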