Module 3: Orchestrate and automate with a pipeline

You can complete this module in about 15 minutes. In this final part of the tutorial, you create a pipeline that orchestrates the Copy job and (optionally) the dataflow you built in Modules 1 and 2, send an email notification when all jobs complete, and schedule the pipeline to run automatically.

Prerequisites

Create a pipeline

First, create a pipeline to orchestrate the Copy job you already built.

  1. From your workspace, select + New item, then search for and choose Pipeline.

    Screenshot of the Data Factory start page with the button to create a new item and Data Pipeline selected.

  2. Provide a pipeline name. Then select Create.

Add your Copy job activity

  1. On the pipeline canvas, select the Activities tab, Copy data, then Add copy job activity.

    Screenshot of the Data Factory pipeline canvas, with the activity window open and add copy job activity selected.

  2. Select the copy job activity on the pipeline canvas, then select the Settings tab below the canvas.

    Screenshot of the pipeline canvas with the copy job activity highlighted and the settings tab highlighted.

  3. Select the Connection dropdown and select Browse all.

    Screenshot of the copy job activity settings list, with browse all highlighted.

  4. Select Copy job under New sources.

  5. On the Connect data source page, select Sign in to authenticate the connection.

    Screenshot of the get data connection credentials page, with the Sign in Option highlighted.

  6. Follow the prompts to sign in to your organizational account.

  7. Select Connect to complete the connection setup.

  8. For Workspace, select the workspace in which you created your Copy job in Module 1.

  9. For Copy job, select the Copy job you created in Module 1.

Add an Office 365 Outlook activity

  1. Select the Activities tab in the pipeline editor and find the Office 365 Email activity.

    Screenshot showing the selection of the Office 365 Outlook activity from the Activities toolbar on the pipeline editor menu.

  2. Select the new Office 365 Email activity and select its Settings tab.

  3. Select the Connection dropdown list, and then select Browse all.

  4. Select Office 365 Email.

  5. Select Sign in to connect your Office 365 account.

    Screenshot showing the Pick an account dialog.

    Note

    The service doesn't currently support personal email. You must use an enterprise email address.

  6. Select Connect.

  7. Select and drag the On success path (a green checkbox on the top right side of the activity in the pipeline canvas) from your Copy job activity to your new Office 365 Email activity.

    Screenshot showing the connection of the success output from the Copy job activity to the new Office 365 Outlook activity.

  8. Select the Office 365 Email activity from the pipeline canvas, then select the Settings tab of the property area below the canvas to configure the email.

    • Enter your email address in the To section. If you want to use several addresses, use ; to separate them.
    • For the Subject, select the field so that the Add dynamic content option appears, and then select it to display the pipeline expression builder canvas.

    Screenshot showing the configuration of the Office 365 Outlook email settings tab.

  9. The Pipeline expression builder dialog appears. Enter the following expression, then select OK:

    @concat('DI in an Hour Pipeline Succeeded with Pipeline Run Id', pipeline().RunId)

    Screenshot showing the pipeline expression builder with the expression provided for the Subject line of the email.

  10. For the Body, select the text field and choose the View in expression builder option when it appears below the text area. In the Pipeline expression builder dialog that appears, enter the following expression (substituting your own copy job activity name), then select OK:

    @concat('RunID = ', pipeline().RunId, ' ; ', 'Files written: ', activity('Copy job_1').output.value[0].output.filesWritten, ' ; ','Throughput: ', activity('Copy job_1').output.value[0].output.throughput,' ; ','Time to copy: ', activity('Copy job_1').output.executionDuration,' ; ','Time in queue: ', activity('Copy job_1').output.durationInQueue)

    Important

    Replace Copy job_1 with the name of your own pipeline copy job activity.

  11. Finally, select the Home tab at the top of the pipeline editor, and choose Run. Then, in the confirmation dialog, select Save and run to execute these activities.

    Screenshot showing the pipeline editor window with the Run button highlighted on the menu.

  12. After the pipeline runs successfully, check your email to find the confirmation email sent from the pipeline.

    Screenshot showing the pipeline status once it's complete.

    Screenshot showing the email generated by the pipeline.
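
To preview what the Subject and Body expressions produce before you run the pipeline, you can approximate them in plain Python. This is only a sketch: the `concat` helper, the sample run ID, the sample metric values, and the example email addresses are all assumptions made for illustration, and the real pipeline expression language may format non-string values differently.

```python
# Illustration only: plain Python standing in for the pipeline expression language.
# All sample values below are invented; a real run supplies its own.

def concat(*parts):
    """Approximate @concat by joining the string form of every argument."""
    return "".join(str(p) for p in parts)

run_id = "00000000-0000-0000-0000-000000000000"  # placeholder for pipeline().RunId

# Hypothetical stand-in for activity('Copy job_1').output, shaped after the
# property paths used in the Body expression.
copy_output = {
    "value": [{"output": {"filesWritten": 3, "throughput": 12.5}}],
    "executionDuration": 42,
    "durationInQueue": 5,
}

subject = concat("DI in an Hour Pipeline Succeeded with Pipeline Run Id", run_id)

first = copy_output["value"][0]["output"]
body = concat(
    "RunID = ", run_id, " ; ",
    "Files written: ", first["filesWritten"], " ; ",
    "Throughput: ", first["throughput"], " ; ",
    "Time to copy: ", copy_output["executionDuration"], " ; ",
    "Time in queue: ", copy_output["durationInQueue"],
)

# Several recipients go in the To field separated by semicolons.
recipients = ";".join(["first@example.com", "second@example.com"])

print(subject)
print(body)
print(recipients)
```

Because the Subject literal ends in `Id` with no trailing space, the run ID follows it directly. If you prefer a separator, end the literal with a space or a colon.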

(Optional) Add a Dataflow activity to the pipeline

You can also add the dataflow you created in Module 2: Create a dataflow in Data Factory to the pipeline.

  1. Hover over the green line connecting the copy job activity and the Office 365 Email activity on your pipeline canvas, and select the + button to insert a new activity.

    Screenshot showing the insert activity button for the connection between the copy job activity and the Office 365 Email activity on the pipeline canvas.

  2. Choose Dataflow from the menu that appears.

    Screenshot showing the selection of Dataflow from the insert activity menu on the pipeline canvas.

  3. The newly created Dataflow activity is inserted between the copy job activity and the Office 365 Email activity, and selected automatically, showing its properties in the area below the canvas. Select the Settings tab on the properties area, and then select your dataflow created in Module 2: Create a dataflow in Data Factory.

    Screenshot showing the Settings tab of the Dataflow activity.
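
With the Dataflow activity inserted, the three activities form a chain linked by On success paths: each activity runs only if the one before it succeeds. The following Python sketch models that behavior with a toy runner. The function name, the activity names, and the stub actions are hypothetical, and this is not how the service itself executes a pipeline.

```python
# Illustration only: a toy runner, not how the service executes pipelines.
from typing import Callable, List, Tuple

Activity = Tuple[str, Callable[[], bool]]  # (name, function returning True on success)

def run_on_success_chain(activities: List[Activity]) -> List[str]:
    """Run activities in order and stop at the first one that fails."""
    completed = []
    for name, action in activities:
        if not action():
            break  # downstream On success paths are not followed
        completed.append(name)
    return completed

# Hypothetical activity names with stub actions that always succeed.
chain = [
    ("Copy job", lambda: True),
    ("Dataflow", lambda: True),
    ("Office 365 Email", lambda: True),
]
print(run_on_success_chain(chain))

# If the dataflow fails, the email activity never runs.
failing = [chain[0], ("Dataflow", lambda: False), chain[2]]
print(run_on_success_chain(failing))
```

In this model, a failed dataflow means no confirmation email is sent, which is why the email in this tutorial signals that every upstream step completed.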

Schedule pipeline execution

Once you finish developing and testing your pipeline, you can schedule it to execute automatically.

  1. On the Home tab of the pipeline editor window, select Schedule.

    A screenshot of the Schedule button on the menu of the Home tab in the pipeline editor.

  2. Select + Add schedule.

  3. Configure the schedule as required. The example here schedules the pipeline to execute daily at 8:00 PM for a year.

    Screenshot showing the schedule configuration for a pipeline to run daily at 8:00 PM until the end of the year.
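
To get a feel for what a daily 8:00 PM schedule means in practice, the following Python sketch lists the run times such a schedule would produce. The start and end dates are hypothetical (the tutorial does not specify them), time zones are ignored, and the code does not call or imitate the service's scheduler.

```python
# Illustration only: enumerates the run times of a daily schedule.
from datetime import date, datetime, time, timedelta

def daily_runs(start: date, end: date, at: time):
    """Yield one run time per day from start to end, inclusive."""
    day = start
    while day <= end:
        yield datetime.combine(day, at)
        day += timedelta(days=1)

# Hypothetical date range chosen for the example.
start = date(2025, 1, 1)
end = date(2025, 12, 31)
runs = list(daily_runs(start, end, time(20, 0)))  # 20:00 is 8:00 PM

print(len(runs))          # number of scheduled executions in the range
print(runs[0], runs[-1])  # first and last run times
```

Counting the runs this way is a quick check on how many pipeline executions, and therefore how many confirmation emails, a schedule will generate.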