Tools

CLI

The command-line interface (CLI) makes it easier to manage Datazone tasks from your terminal.

Auth

  • datazone auth configure: Configures the CLI with the user credentials.

  • datazone auth login: Logs in the user and stores the credentials in the CLI.

Repository

  • datazone repository list: Lists all the repositories.

  • datazone repository create: Creates a new repository.

  • datazone repository deploy: Deploys the repository.

  • datazone repository summary <file_name>: Generates the summary of the repository.

  • datazone repository clone <repository_id>: Clones the repository to your current directory.

  • datazone repository pull: Pulls the repository.
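As a rough sketch, the Auth and Repository commands above might be chained like this when starting work on an existing repository. The `dz` wrapper, the `DZ_RUN` variable, and the repository ID are hypothetical and only serve the illustration: by default the wrapper prints each command instead of running it, so the sequence can be reviewed first. The order of the steps is an assumption, not something the reference prescribes.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper (not part of the Datazone CLI):
# runs the real command when DZ_RUN=1, otherwise only prints it.
dz() {
  if [ "${DZ_RUN:-0}" = "1" ]; then
    datazone "$@"
  else
    echo "datazone $*"
  fi
}

# Placeholder value; copy a real ID from the output of `repository list`.
REPOSITORY_ID="my-repository-id"

dz auth login                          # log in and store the credentials
dz repository list                     # look up the repository ID
dz repository clone "$REPOSITORY_ID"   # clone into the current directory
dz repository pull                     # pull the repository
dz repository deploy                   # deploy the repository
```

Setting `DZ_RUN=1` before running the script executes the commands for real, provided the CLI is installed and on your `PATH`.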

Dataset

  • datazone dataset list: Lists all the datasets.

  • datazone dataset show <dataset_id> [<branch_name>] [<size>] [<transaction_id>]: Shows the dataset. Parameters are:

    • dataset_id: The id of the dataset. Required.

    • branch_name: The branch name of the dataset.

    • size: The size of the dataset.

    • transaction_id: The specific transaction id of the dataset.

  • datazone dataset transactions <dataset_id>: Lists all the transactions of the dataset.
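The dataset commands above could be used together as in the following sketch. It assumes the optional arguments of `dataset show` are positional and given in the order shown in the synopsis; the dataset ID, the branch name `main`, and the size `20` are placeholder values, and the reference does not say what unit `size` is measured in. The `dz` wrapper and `DZ_RUN` are hypothetical helpers that print commands rather than run them unless told otherwise.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper (not part of the Datazone CLI):
# runs the real command when DZ_RUN=1, otherwise only prints it.
dz() {
  if [ "${DZ_RUN:-0}" = "1" ]; then
    datazone "$@"
  else
    echo "datazone $*"
  fi
}

# Placeholder value; copy a real ID from the output of `dataset list`.
DATASET_ID="my-dataset-id"

dz dataset list                          # find the dataset ID
dz dataset show "$DATASET_ID"            # show with default options
dz dataset show "$DATASET_ID" main 20    # assumed: branch "main", size 20
dz dataset transactions "$DATASET_ID"    # list the dataset's transactions
```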

View

  • datazone dataset view create: Creates a new view.

  • datazone dataset view list <dataset_id>: Lists all the views of the dataset.

  • datazone dataset view delete <view_id>: Deletes the view.

Source

  • datazone source create: Creates a new source.

  • datazone source list: Lists all the sources.

  • datazone source update <source_id>: Updates the source.

  • datazone source delete <source_id>: Deletes the source.

Extract

  • datazone extract create: Creates a new extract.

  • datazone extract list: Lists all the extracts.

  • datazone extract update <extract_id>: Updates the extract.

  • datazone extract delete <extract_id>: Deletes the extract.

  • datazone extract execute <extract_id>: Executes the extract.

Schedule

  • datazone schedule create: Creates a new schedule.

  • datazone schedule list: Lists all the schedules.

  • datazone schedule delete <schedule_id>: Deletes the schedule.

Execution

  • datazone execution run [--extract-id] [--pipeline-id] [<execution_type>] [<transform_selection>]: Runs the execution. Parameters are:

    • extract_id: The id of the extract.

    • pipeline_id: The id of the pipeline.

    • execution_type: The type of the execution. Can be full or incremental.

    • transform_selection: The selection of the transforms to be executed.

  • datazone execution list [--extract-id] [--pipeline-id]: Lists all the executions. Parameters are:

    • extract_id: The id of the extract.

    • pipeline_id: The id of the pipeline.

  • datazone execution log <execution_id>: Shows the log of the execution.
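A possible way to combine the execution commands is sketched below: start a run for a pipeline, list that pipeline's executions, then read the log of one of them. It assumes `--pipeline-id` takes its value as the next argument, which the synopsis does not spell out, and it leaves out `execution_type` and `transform_selection` because the reference does not show how they are passed. Both IDs are placeholders, and the `dz` wrapper and `DZ_RUN` are hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper (not part of the Datazone CLI):
# runs the real command when DZ_RUN=1, otherwise only prints it.
dz() {
  if [ "${DZ_RUN:-0}" = "1" ]; then
    datazone "$@"
  else
    echo "datazone $*"
  fi
}

# Placeholder values; replace them with real IDs.
PIPELINE_ID="my-pipeline-id"
EXECUTION_ID="my-execution-id"

# Assumption: the flag is followed by its value as a separate argument.
dz execution run --pipeline-id "$PIPELINE_ID"    # start an execution
dz execution list --pipeline-id "$PIPELINE_ID"   # list the pipeline's executions
dz execution log "$EXECUTION_ID"                 # show the log of one execution
```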

Pipeline

  • datazone pipeline create: Creates a new pipeline.

Common

  • datazone version: Shows the version of the CLI.

  • datazone info: Shows the information of the CLI.

© Copyright 2024. All rights reserved.
