DATA ENGINEER – RESEARCH and DEVELOPMENT “CLEARSKIES TDIR PLATFORM”

WE ARE ODYSSEY, looking for Cyber Warriors to join our journey!

As pioneers in the cybersecurity arena, our journey parallels that of the legendary Odysseus. Just as he ventured into the unknown with unwavering determination, we too navigate the ever-evolving threat landscape with an innovative and forward-thinking mindset.

Our mission is clear: to make the world a cyber safer place. If you share this determination, join our ranks and become an integral part of this journey, contributing your unique skills and perspectives to tackle impossible challenges as we build cyber-resilient futures for our clients. We firmly believe in the power of many and we promote an environment where your voice matters, learning and growth are encouraged, and innovation is rewarded.

Are you someone who thrives in the face of challenges?

Do you have a collaborative spirit, a passion for innovation, and a commitment to making the world a cyber safer place for all?

If so, join OUR Odyssey and make it your journey as well, because the beauty and reward lie in the journey and not the destination itself.

ROLE DESCRIPTION

We are looking for an experienced Data Engineer who works with large language models (LLMs) to join the forward-thinking Research and Development team behind our award-winning “ClearSkies TDIR Platform”.

This role is crucial for developing and maintaining scalable data pipelines and infrastructure to support the training and deployment of large language models. The ideal candidate will combine data engineering skills with a deep understanding of the intricacies of managing data for LLMs and other advanced models, from preprocessing through to optimization for performance at scale.

MAIN RESPONSIBILITIES

Design, build, and maintain scalable and efficient data pipelines specifically tailored for training and deploying large language models.
Work closely with data scientists and machine learning engineers to understand data requirements for LLM projects, including data collection, processing, and storage needs.
Implement and manage data ingestion routines from a variety of sources, ensuring data quality and accessibility for LLM training.
Optimize data infrastructure to support the computational demands of LLMs, including performance tuning and scalability improvements.
Develop tools and processes for monitoring and analyzing data pipeline performance and data quality, ensuring the integrity and availability of data.
Collaborate with cross-functional teams to ensure seamless integration of LLMs into production environments, including support for model versioning, deployment, and monitoring.
Stay abreast of the latest developments in large language models, data engineering practices, and technologies to continually improve pipeline efficiency and model performance.
Ensure compliance with data governance and security policies throughout the data lifecycle, from ingestion to model deployment.

KNOWLEDGE, SKILLS AND EXPERIENCE REQUIRED

Proven experience as a Data Engineer, with specific experience working on projects involving large language models.
Strong expertise in data modeling, ETL processes, and data pipeline tools.
Proficient in programming languages commonly used in data engineering and machine learning, such as Python and SQL.
Experience with big data technologies (e.g., Hadoop, Spark) and cloud services (AWS, Google Cloud, Azure) tailored for machine learning and data processing workloads.
Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes) for deploying and managing LLM applications.
Familiarity with machine learning operations (MLOps) practices for managing the lifecycle of machine learning models, including large language models.
Excellent problem-solving skills, with the ability to work independently and as part of a team in a fast-paced environment.
Strong communication skills, with the ability to explain complex technical concepts to non-technical stakeholders.

Additional Skills (Preferred, but not mandatory):

Experience with neural network optimization techniques for efficient training and inference.
Contributions to open-source projects related to large language models or data engineering.
Certifications in cloud technologies, big data, or machine learning.

Qualifications:

Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field.
2+ years of proven experience as a Data Engineer.

WHAT’S IN IT FOR YOU

Competitive remuneration package (according to experience and qualifications)
Opportunity to work in a highly specialized, progressive and professional setting
Hybrid and contemporary working environment, “Best Place to Work” for 3 consecutive years
13th salary
Provident Fund
Medical and Life Insurance
Referral Scheme: recommend top talent to the company and receive a reward
Half-day on Fridays
Performance-based awards and bonuses
Access to the latest technologies
Mentoring, training & development opportunities

Code

136

Job Location

Nicosia, Cyprus
