      ciValue




      Data Engineer Interview

      16 Mar 2023
      Anonymous interview candidate
      Haifa
      Declined offer
      Positive experience
      Average interview

      Application

      I applied online. The process took 3 weeks. I interviewed at ciValue (Haifa) in March 2023.

      Interview

      The process consisted of a phone call with the hiring manager, an on-site technical interview (about 1.5 to 2 hours), an on-site HR interview, a 1-hour on-site interview with the VP R&D, and a 30-minute on-site meeting with the VP HR. Although all the interviews had to be on-site and parking in the area is scarce, the process was fine, and the people generally made a very positive impression on me. However, my overall feeling from the visits was very depressing: the office is very small and grey, with small rooms and small desks. Although it is located in a very beautiful green area, I felt like I had no air to breathe.

      Interview questions [4]

      Question 1

      Spark optimizations: which optimizations can be applied to the code snippet below?

      shoppers_df (customer description DF): 250 MB, 15M records
      schema: StructType = StructType(Array(
          StructField("shopper_id", LongType, nullable = True),
          StructField("retailer_id", StringType, nullable = True),
          StructField("shopper_group_id", StringType, nullable = True),
          StructField("join_date", DateType, nullable = True),
          StructField("shopper_type", StringType, nullable = True),
          StructField("gender", StringType, nullable = True)))

      sku_df (dimension DF): 15 MB, 90K records

      purchase_df (transactions DF): 50 GB of compressed Parquet files, 5,000,000,000 records
      schema: StructType = StructType(Array(
          StructField("shopper_id", LongType, nullable = True),
          StructField("product_id", LongType, nullable = True),
          StructField("pos_id", IntegerType, nullable = True),
          StructField("purchase_date", DateType, nullable = True),
          StructField("units", DoubleType, nullable = True),
          StructField("total_spent", DoubleType, nullable = True)))

      Current code:
      products_purchased_df = (purchase_df.alias("purchase")
          .join(shoppers_df, on = "shopper_id", how = "left outer")
          .join(sku_df.alias("sku"), on = "product_id")
          .select(col("purchase.*"), col("sku.*")))

      Usage:
      status_df = products_purchased_df.groupBy(["shopper_id", "product_id"]).agg(...)

      Task: optimize the join statement.
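One way to approach the join in Question 1 is sketched below in PySpark. It is not the interviewer's expected answer. The function name `join_purchases_with_sku` is hypothetical, and dropping the shoppers join assumes `shopper_id` is unique in `shoppers_df`.

```python
def join_purchases_with_sku(purchase_df, sku_df):
    """Sketch of a leaner join for the snippet in the question."""
    # Imported lazily so this module loads without a Spark installation.
    from pyspark.sql.functions import broadcast

    # 1) The left outer join to shoppers_df is dropped: no shopper column is
    #    selected, and a left join never filters purchase rows. ASSUMPTION:
    #    shopper_id is unique in shoppers_df; otherwise the original join
    #    would duplicate purchase rows and the results would differ.
    # 2) sku_df (15 MB) is above Spark's default 10 MB
    #    spark.sql.autoBroadcastJoinThreshold, so an explicit broadcast hint
    #    asks Spark to ship it to the executors instead of shuffling the
    #    50 GB purchase_df.
    return purchase_df.join(broadcast(sku_df), on="product_id", how="inner")
```

Since the result is only used for `groupBy(["shopper_id", "product_id"])`, it may also help to read only the needed Parquet columns and to aggregate `purchase_df` before joining the SKU attributes, so the join runs on far fewer rows. Whether that is valid depends on the elided `agg(...)` expressions.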

      Question 2

      Data modelling: given an input file of shoppers that should be loaded into a row-based DB, what is the optimal DB model (table or tables, and columns) that will perform best for the following queries?

      1) Get shoppers that are eligible for email & FB
      2) Get shoppers that are eligible for email OR App
      3) Get active shoppers (status = "A") that are NOT eligible for SMS

      Assumptions:
      - there are 4 different delivery channels: e-mail, App, FB, SMS
      - a shopper may have more than one delivery channel
      - a shopper has one of 2 statuses: A (Active) or D (Disabled)

      Input data structure:

      +----------+--------+------+--------+------+------+------+
      | id (key) | status | city | dc_1   | dc_2 | dc_3 | dc_4 |
      +----------+--------+------+--------+------+------+------+
      | L1       | A      | NY   | e-mail | SMS  |      |      |
      +----------+--------+------+--------+------+------+------+
      | L2       | A      | LA   | e-mail | FB   | App  |      |
      +----------+--------+------+--------+------+------+------+
      | L3       | D      | LA   | SMS    | FB   |      |      |
      +----------+--------+------+--------+------+------+------+
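One possible model for Question 2 replaces the positional dc_1..dc_4 columns with one flag column per channel, so that each query becomes a plain predicate on a single table. The sketch below uses Python's built-in sqlite3 as a stand-in for the row-based DB. The table name `shoppers` and the `has_*` column names are assumptions, not part of the question.

```python
import sqlite3

CHANNELS = ("email", "app", "fb", "sms")  # hypothetical flag-column suffixes

def load_shoppers(rows):
    """Load input rows (id, status, city, dc_1..dc_4) into a flag-per-channel table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE shoppers (
               id        TEXT PRIMARY KEY,
               status    TEXT NOT NULL CHECK (status IN ('A', 'D')),
               city      TEXT,
               has_email INTEGER NOT NULL,
               has_app   INTEGER NOT NULL,
               has_fb    INTEGER NOT NULL,
               has_sms   INTEGER NOT NULL
           )"""
    )
    for shopper_id, status, city, *dcs in rows:
        # Normalise "e-mail" -> "email", "FB" -> "fb", and skip empty slots.
        present = {dc.lower().replace("-", "") for dc in dcs if dc}
        flags = [int(channel in present) for channel in CHANNELS]
        conn.execute(
            "INSERT INTO shoppers VALUES (?, ?, ?, ?, ?, ?, ?)",
            (shopper_id, status, city, *flags),
        )
    return conn

def ids(conn, where):
    """Return shopper ids matching a WHERE clause, sorted by id."""
    query = f"SELECT id FROM shoppers WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(query)]

# The three sample rows from the question.
rows = [
    ("L1", "A", "NY", "e-mail", "SMS", "", ""),
    ("L2", "A", "LA", "e-mail", "FB", "App", ""),
    ("L3", "D", "LA", "SMS", "FB", "", ""),
]
conn = load_shoppers(rows)
email_and_fb = ids(conn, "has_email = 1 AND has_fb = 1")      # query 1
email_or_app = ids(conn, "has_email = 1 OR has_app = 1")      # query 2
active_no_sms = ids(conn, "status = 'A' AND has_sms = 0")     # query 3
```

With only four fixed channels, flag columns avoid joins and self-joins. A normalised shopper/channel table would be the usual alternative if the set of channels were expected to grow.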

      Question 3

      Data integrity: given transaction partition files (100 files) that are batch-ingested by pipelines from storage (such as S3) into a distributed DWH, what is the preferred data structure for ingestion to ensure data integrity (each invoice is fixed or ingested only once)?

      Details:
      - each invoice has its own unique id, and each invoice contains a list of products to be added or fixed
      - the ingestion procedure upserts the data: update if the invoice already exists, or insert if the invoice is new
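A minimal sketch of one idempotent ingestion scheme for Question 3 follows, with Python's built-in sqlite3 standing in for the distributed DWH. It keys invoice lines on (invoice_id, product_id) so that a fix overwrites instead of duplicating, and keeps a ledger of applied files so that a replayed file is a no-op. The table names, column names, and the `units` field are assumptions for illustration.

```python
import sqlite3

def open_store():
    """Create the target table and the ledger of already-applied files."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ingested_files (file_name TEXT PRIMARY KEY)")
    conn.execute(
        """CREATE TABLE invoice_lines (
               invoice_id TEXT NOT NULL,
               product_id TEXT NOT NULL,
               units      REAL NOT NULL,
               PRIMARY KEY (invoice_id, product_id)
           )"""
    )
    return conn

def ingest_file(conn, file_name, lines):
    """Apply one file atomically; return False if it was already applied."""
    with conn:  # commits on success, rolls back if any statement raises
        seen = conn.execute(
            "SELECT 1 FROM ingested_files WHERE file_name = ?", (file_name,)
        ).fetchone()
        if seen:
            return False
        # Upsert: a line with an existing (invoice_id, product_id) key is
        # replaced, a new key is inserted.
        conn.executemany(
            "INSERT OR REPLACE INTO invoice_lines VALUES (?, ?, ?)", lines
        )
        conn.execute("INSERT INTO ingested_files VALUES (?)", (file_name,))
    return True

conn = open_store()
batch = [("inv-1", "p1", 2.0), ("inv-1", "p2", 1.0)]
first = ingest_file(conn, "part-001", batch)
replay = ingest_file(conn, "part-001", batch)  # same file delivered twice
fix = ingest_file(conn, "part-002", [("inv-1", "p2", 3.0), ("inv-1", "p3", 1.0)])
stored = conn.execute(
    "SELECT product_id, units FROM invoice_lines "
    "WHERE invoice_id = 'inv-1' ORDER BY product_id"
).fetchall()
```

In a real DWH the same idea would typically be expressed as a staging table plus a MERGE on the invoice key. Recording the file and its rows in the same transaction is what keeps a partially applied file from being counted as done.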

      Question 4

      Data validation: given transaction input files that are validated before the ETL process, suggest the appropriate technology and the metrics to be checked in order to ensure seamless data integrity. Which types of data validation would you suggest for this structure?

      File structure:
      invoice_id (str)
      timestamp (timestamp)
      store_id (str)
      customer_id (str)
      product_id (str)
      quantity (float)
      purchase_spent (float)
      purchase_discount (float)

      Assumptions:
      - file volume: 35M records, size 5 GB
      - transaction files can be single or multiple
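The kinds of metrics Question 4 asks about can be illustrated with a small pure-Python sketch. The column names follow the file structure in the question, with the first column written as `invoice_id`. The specific rules are assumptions, not requirements from the question: ISO-8601 timestamps, positive quantity, non-negative amounts, and one line per invoice and product. At 35M records the same checks would more realistically run in a distributed engine or a dedicated data-quality tool.

```python
from collections import Counter
from datetime import datetime

REQUIRED = ("invoice_id", "timestamp", "store_id", "customer_id",
            "product_id", "quantity", "purchase_spent", "purchase_discount")

def validate(rows):
    """Count validation failures per category over an iterable of dict rows."""
    metrics = Counter(row_count=0)
    seen_keys = set()
    for row in rows:
        metrics["row_count"] += 1
        # Completeness: every column must be present and non-empty.
        if any(row.get(field) in (None, "") for field in REQUIRED):
            metrics["missing_value"] += 1
            continue
        # Type checks: values are assumed to arrive as strings (e.g. from CSV).
        try:
            datetime.fromisoformat(row["timestamp"])
            quantity = float(row["quantity"])
            spent = float(row["purchase_spent"])
            discount = float(row["purchase_discount"])
        except ValueError:
            metrics["bad_type"] += 1
            continue
        # Range checks (assumed business rules).
        if quantity <= 0 or spent < 0 or discount < 0:
            metrics["out_of_range"] += 1
        # Uniqueness: one line per (invoice_id, product_id) is assumed.
        key = (row["invoice_id"], row["product_id"])
        if key in seen_keys:
            metrics["duplicate_line"] += 1
        seen_keys.add(key)
    return dict(metrics)

def make_row(**overrides):
    """Build a valid hypothetical row, then apply overrides."""
    row = {"invoice_id": "inv-1", "timestamp": "2023-03-16T10:00:00",
           "store_id": "s1", "customer_id": "c1", "product_id": "p1",
           "quantity": "2", "purchase_spent": "10.5", "purchase_discount": "0"}
    row.update(overrides)
    return row

sample = [
    make_row(),
    make_row(),                                  # duplicate invoice line
    make_row(product_id="p2", store_id=""),      # missing value
    make_row(product_id="p3", quantity="abc"),   # wrong type
    make_row(product_id="p4", quantity="-1"),    # out of range
]
report = validate(sample)
```

Comparing `row_count` against an expected count from the file producer, and tracking the failure ratios per file over time, would turn these counters into pass/fail gates before the ETL step.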
