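The code defines a function named xxx that takes two parameters: an integer (arg1) and an instance of the Luigi task GenerateRandomData (arg2). The function first checks the parameter types, then runs the task to generate data, reads the generated CSV file, keeps only the rows whose Age is greater than arg1, and returns the filtered DataFrame. Note that arg1 acts as the age threshold for the filter; the number of generated rows is chosen at random (10 to 100) inside the task itself.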
Technology Stack : The code uses the following packages and technologies: Luigi, pandas, random, and datetime.
import random
from datetime import datetime

import luigi
import pandas as pd


class GenerateRandomData(luigi.Task):
    def output(self):
        return luigi.LocalTarget('random_data.csv')

    def run(self):
        n = random.randint(10, 100)
        data = pd.DataFrame({
            'Name': [f'Name_{i}' for i in range(n)],
            'Age': [random.randint(18, 60) for _ in range(n)],
            'Date': [datetime.now() for _ in range(n)]
        })
        data.to_csv(self.output().path, index=False)


def xxx(arg1, arg2):
    # arg1 is the age threshold for the filter, arg2 is the Luigi Task instance
    if not isinstance(arg1, int) or not isinstance(arg2, GenerateRandomData):
        raise ValueError("arg1 must be an integer and arg2 must be an instance of GenerateRandomData")
    # Run the task with the local scheduler so the CSV exists before it is read.
    # Luigi skips the task if random_data.csv is already present.
    luigi.build([arg2], local_scheduler=True)
    data_path = arg2.output().path
    # Read the generated data
    data = pd.read_csv(data_path)
    # Filter the data based on age (strictly greater than arg1)
    filtered_data = data[data['Age'] > arg1]
    # Return the filtered data
    return filtered_data
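
The read-and-filter step at the end of xxx can be tried in isolation, without running Luigi. The following is a minimal sketch, assuming a small in-memory CSV with the same Name and Age columns as random_data.csv. The helper filter_by_age and the sample rows are hypothetical and not part of the original code.

```python
import io

import pandas as pd

# Hypothetical sample data with the same columns the task writes.
SAMPLE_CSV = """Name,Age
Name_0,18
Name_1,30
Name_2,45
Name_3,60
"""


def filter_by_age(csv_source, min_age):
    """Return the rows whose Age is strictly greater than min_age."""
    if not isinstance(min_age, int):
        raise ValueError("min_age must be an integer")
    data = pd.read_csv(csv_source)
    return data[data['Age'] > min_age]


result = filter_by_age(io.StringIO(SAMPLE_CSV), 30)
print(result['Name'].tolist())  # ['Name_2', 'Name_3']
```

The comparison is strict, so the row with Age equal to the threshold (Name_1, age 30) is dropped. Since the task generates ages between 18 and 60, a threshold of 60 or more would return an empty DataFrame.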