Data Engineer

Lisbon, Portugal
Full Time
Mid Level

Description
Who is Defined.ai? Well, from a technical point of view, we leverage the power of a global crowd to provide some of the world’s biggest companies with the high-quality data they need to power their artificial intelligence. We’re instrumental to the progression and development of artificial intelligence and we couldn’t be prouder or more inspired to be involved in an industry that is changing the world.
From a personal point of view, we’re a group of big thinkers, high achievers and creative problem solvers. We bond over our shared love of software engineering, data science, and strong coffee. We like online gaming, running marathons, and team drinks. We celebrate authenticity and diversity and we’re invested in what we do. Our mission? World domination, obviously!


What will you do?

  • Design and implement scalable PySpark-based data pipelines to clean, validate, and package AI datasets;
  • Develop ETL pipelines to fuel the Operations areas with data for their analytical dashboards;
  • Set software engineering tools, platforms, and best practices while performing trade-off analysis to best match engineering, product, and project constraints and expectations;
  • Help the Product Manager structure, break down, and prioritize the product roadmap into backlog work items;
  • Collaborate with other software engineering teams, such as SREs and DevOps, to achieve your team’s goals;
  • Work with Software Engineering teams to integrate the Data Platform with other tools and platforms.


Who are we looking for?
Do you have the drive to work in an innovative and ambitious environment?

We’re looking for someone with a determined and proactive mindset, someone inspired and passionate to help us achieve our goals. Our successful candidate is a strong critical thinker, reliable and transparent, with an ability to learn and communicate. We are looking for someone special to contribute to our unique culture.

Our Data Engineer has/is: 

  • BSc or MSc in Computer Science or similar background;
  • Experience in PySpark-based data pipelines and software quality best practices;
  • Experience working with Azure services such as Synapse Analytics (mainly PySpark Jobs, Pipelines, and Notebooks), ADLS, Power BI, and DevOps, as well as SQL and NoSQL databases;
  • Solid understanding of Data Warehouse and ETL concepts, techniques, and processes;
  • Comfortable with evaluating and applying software design and architectural patterns/principles;
  • Accustomed to working with data lake and data pipeline architectures;
  • Knowledge of RESTful APIs based on FastAPI, from the provider as well as the consumer point of view;
  • Problem-solving skills;
  • Proficient in both written and spoken English.


Benefits
You spend a lot of your time at work, so it should be challenging, fun and interesting. At Defined.ai it will be all of those things and more. Here’s what we offer:

  • Flexible working schedule and hybrid model. We know comfort can boost creativity and performance, so you can manage your own schedule and work from either one of our modern office spaces or home.
  • Excellent career development opportunities in a high-growth company. With us, you can accomplish your career goals and follow a well-defined career path with the support of your supervisor.
  • Culture of feedback and continuous improvement. AI is a fast-paced area, so we keep track of tech trends, and we always ask for feedback.
  • An international and diverse team. We have more than 30 nationalities at our 3 locations, and we provide language classes.
  • Continuous training opportunities. You can choose from many options: hands-on workshops, unlimited access to Udemy, and formal development opportunities.
  • We love to have fun together. We joke a lot, and we can't imagine work without fun activities: we have already surfed, raced karts, and played soccer together.


About Us
Defined.ai offers a platform with multiple data delivery options that leverages machine learning technology and human intelligence to deliver quality-guaranteed training data for AI systems. The platform offers self-service and fully customizable solutions that deliver high-quality, project-specific training data, enabling AI products to reach the market faster. It is this business model that has allowed Defined.ai to raise a total of $63.6M in funding over 4 rounds. Our value proposition is quality, privacy, speed, and scale, covering more than 50 different languages. With strong expertise in speech and natural language processing technologies, we have been serving AI companies and Fortune 500 companies since day one. Defined.ai was founded in Seattle and has offices in Lisbon and Porto.

Privacy Notice: https://defined.ai/dataset/privacy-notice-career
