In programmatic advertising, ads (commonly referred to as impressions) are shown on websites with the help of DSPs (Demand Side Platforms). These platforms want to make sure that they are showing the right impressions to the right users to generate as many conversions as possible.
If a user clicks on the ad and lands on the advertiser’s website, we store this event in a table called clicks (schema shown below). If the user successfully places a deposit after landing on the advertiser’s website, we store this event in a table called conversions (schema shown below). For each conversion there exists at least one impression, but not all impressions have a conversion or a click.
You are given 3 tables with the following schemas, defined by column names and types:
impressionsimpression_id: string
url_address: string
user_id: string
request_country: string
tracking_type: string [this is the tracking type (fingerprinted or cookie-based)]
dynamic_display: boolean [this is whether the impression was served through Dynamic Display]
dynamic_display_variables: string [content served in the impression]
request_browser_name: string
timestamp: date
clicksimpression_id: int
user_id: int
timestamp: string
conversionsconversion_id: string
user_id: string
dval: integer [this is the deposit value of the conversion]
curr: string [this is the currency of the deposit value]
timestamp: date
Задание:
Using the tables defined above, provide SQL queries that answer the following questions: Please provide which SQL Dialect you have chosen: I.e., {PostgreSQL, MySQL, SQLite, or SQL Server}
Q1: What is the CTR (%) for impressions served in "Austria"?
Q2: For each converted user, find out how many impressions they were served. Specifically capturing the timestamp for the first & last impression.
Q3: What other insights could you present which the audience might find interesting from the above schemas? Do not answer this question with SQL, just write about what else could be interesting and why.
Задание #2
Дано:
Piwik is a tool used to track internal products and how new features we release are used by users. The goal is to measure the value these features bring to the team and thus company.
We have attached a file which contains data you should use for answering the below questions:
Table Schema:
page_urlThis is the URL where this event has occurredClick_textThis is the event which has been trackedUser_holding_idWhen user_role != ‘sr-admin’, this identifies the external client who is using the product and allows us to differentiate between clientsUser_roleThis allows us to identify is the user is from SR or notunixtimedateThis is the Unix Timestamp for when the event occurredSession_idThis is the Session IDTotal_eventsThis is the number of events
Задание:
Using the attached data, feel free to generate any insights you think could be interesting, below are some starting questions to help you get started
Q1: How many clients do we have based on the data above?
Q2: How many sessions do we have each week, broken down by internal & external users
Q2: How many campaigns were created successfully?
Q3: What other insights could you present which the audience might find interesting?
Q4: The Product Owner (PO) lets you know a new feature is going to be released soon and wants to track its success. How would you go about supporting this request?
Задание #3
Задание:
Please prepare a presentation of your findings for the above tasks (no longer than 10 minutes). The presentation should include both SQL & Visualizations created.