Google Associate Data Practitioner ADP exam questions with solutions:
1. You are designing a pipeline to process data files that arrive in Cloud Storage by 3:00 am each day. Data processing is performed in stages, where the output of one stage becomes the input of the next. Each stage takes a long time to run. Occasionally a stage fails, and you have to address the problem. You need to ensure that the final output is generated as quickly as possible. What should you do?
A) Design the pipeline as a set of PTransforms in Dataflow. Restart the pipeline after correcting any stage output data errors.
B) Design the workflow as a Cloud Workflow instance. Code the workflow to jump to a given stage based on an input parameter. Rerun the workflow after correcting any stage output data errors.
C) Design the processing as a directed acyclic graph (DAG) in Cloud Composer. Clear the state of the failed task after correcting any stage output data errors.
D) Design a Spark program that runs under Dataproc. Code the program to wait for user input when an error is detected. Rerun the last action after correcting any stage output data errors.
2. You are working with a large dataset of customer reviews stored in Cloud Storage. The dataset contains several inconsistencies, such as missing values, incorrect data types, and duplicate entries. You need to clean the data to ensure that it is accurate and consistent before using it for analysis. What should you do?
A) Use the PythonOperator in Cloud Composer to clean the data and load it into BigQuery. Use SQL for analysis.
B) Use Storage Transfer Service to move the data to a different Cloud Storage bucket. Use event triggers to invoke Cloud Run functions to load the data into BigQuery. Use SQL for analysis.
C) Use BigQuery to batch load the data into BigQuery. Use SQL for cleaning and analysis.
D) Use Cloud Run functions to clean the data and load it into BigQuery. Use SQL for analysis.
3. Your organization uses Dataflow pipelines to process real-time financial transactions. You discover that one of your Dataflow jobs has failed. You need to troubleshoot the issue as quickly as possible. What should you do?
A) Set up a Cloud Monitoring dashboard to track key Dataflow metrics, such as data throughput, error rates, and resource utilization.
B) Create a custom script to periodically poll the Dataflow API for job status updates, and send email alerts if any errors are identified.
C) Use the gcloud CLI tool to retrieve job metrics and logs, and analyze them for errors and performance bottlenecks.
D) Navigate to the Dataflow Jobs page in the Google Cloud console. Use the job logs and worker logs to identify the error.
4. You are predicting customer churn for a subscription-based service. You have a 50 PB historical customer dataset in BigQuery that includes demographics, subscription information, and engagement metrics. You want to build a churn prediction model with minimal overhead. You want to follow the Google-recommended approach. What should you do?
A) Create a Looker dashboard that is connected to BigQuery. Use LookML to predict churn.
B) Use the BigQuery Python client library in a Jupyter notebook to query and preprocess the data in BigQuery. Use the CREATE MODEL statement in BigQuery ML to train the churn prediction model.
C) Use Dataproc to create a Spark cluster. Use the Spark MLlib within the cluster to build the churn prediction model.
D) Export the data from BigQuery to a local machine. Use scikit-learn in a Jupyter notebook to build the churn prediction model.
5. You manage data at an ecommerce company. You have a Dataflow pipeline that processes order data from Pub/Sub, enriches the data with product information from Bigtable, and writes the processed data to BigQuery for analysis. The pipeline runs continuously and processes thousands of orders every minute. You need to monitor the pipeline's performance and be alerted if errors occur. What should you do?
A) Use Cloud Monitoring to track key metrics. Create alerting policies in Cloud Monitoring to trigger notifications when metrics exceed thresholds or when errors occur.
B) Use Cloud Logging to view the pipeline logs and check for errors. Set up alerts based on specific keywords in the logs.
C) Use the Dataflow job monitoring interface to visually inspect the pipeline graph, check for errors, and configure notifications when critical errors occur.
D) Use BigQuery to analyze the processed data in Cloud Storage and identify anomalies or inconsistencies. Set up scheduled alerts that fire when anomalies or inconsistencies occur.
Questions and answers:
| Question 1 Answer: C | Question 2 Answer: C | Question 3 Answer: D | Question 4 Answer: B | Question 5 Answer: A |
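The reasoning behind answer C for question 1 can be illustrated with a small sketch. It does not use the Airflow or Cloud Composer API. It is a plain-Python stand-in, and the names `run_pipeline`, `clear_task`, and the three stage functions are hypothetical. The idea it shows: stages that already succeeded keep their state and are skipped on the next run, so after fixing the bad data and clearing only the failed stage, the rerun resumes from that stage rather than from the start.

```python
from typing import Callable, Dict, List


def run_pipeline(stages: List[str],
                 work: Dict[str, Callable[[], None]],
                 state: Dict[str, str]) -> List[str]:
    """Run stages in order and return the names of the stages attempted.

    Stages already marked "success" in `state` are skipped, which mimics
    a scheduler that reuses the output of completed tasks.
    """
    attempted = []
    for name in stages:
        if state.get(name) == "success":
            continue  # completed earlier; do not redo the long-running work
        attempted.append(name)
        try:
            work[name]()
        except Exception:
            state[name] = "failed"
            break  # downstream stages cannot run without this stage's output
        state[name] = "success"
    return attempted


def clear_task(state: Dict[str, str], name: str) -> None:
    """Forget the recorded state of one stage so the next run retries it."""
    state.pop(name, None)


# Hypothetical three-stage pipeline; the middle stage fails on bad input.
input_is_bad = {"value": True}


def extract() -> None:
    pass


def transform() -> None:
    if input_is_bad["value"]:
        raise ValueError("bad stage input")


def load() -> None:
    pass


stages = ["extract", "transform", "load"]
work = {"extract": extract, "transform": transform, "load": load}
state: Dict[str, str] = {}

first_run = run_pipeline(stages, work, state)   # stops at "transform"
input_is_bad["value"] = False                   # correct the data error
clear_task(state, "transform")                  # clear only the failed stage
second_run = run_pipeline(stages, work, state)  # resumes at "transform"

print(first_run)
print(second_run)
```

In this sketch the second run never repeats `extract`. That is the property that makes clearing a failed task faster than restarting a whole pipeline when each stage takes a long time.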





