Massive knowledge infrastructure internship | Adaltas

Faheem

Job description

Big Data and distributed computing are on the core of Adaltas. We accompagny our companions within the deployment, upkeep, and optimization of a few of the largest clusters in France. Since lately we additionally present help for day-to-day operations.

As an important defender and lively contributor of open supply, we’re on the forefront of the info platform initiative TDP (TOSIT Data Platform).

Throughout this internship, you’ll contribute to the event of TDP, its industrialization, and the mixing of latest open supply parts and new functionalities. You may be accompanied by the Alliage expert team answerable for TDP editor help.

Additionally, you will work with the Kubernetes ecosystem and the automation of datalab deployments Onyxia, which we wish to make accessible to our prospects in addition to to college students as a part of our instructing modules (devops, large knowledge, and many others.).

Your {qualifications} will assist to increase the providers of Alliage’s open source support providing. Supported open supply parts embody TDP, Onyxia, ScyllaDB, … For individuals who wish to do some internet work along with large knowledge, we have already got a really practical intranet (ticket administration, time administration, superior search, mentions and associated articles, …) however different good options are anticipated.

You’ll apply GitOps launch chains and write articles.

You’ll work in a crew with senior advisors as mentor.

Firm presentation

Adaltas is a consulting company led by a crew of open supply specialists specializing in knowledge administration. We deploy and function the storage and computing infrastructures in collaboration with our prospects.

Companion with Cloudera and Databricks, we’re additionally open supply contributors. We invite you to browse our website and our many technical publications to be taught extra in regards to the firm.

Expertise required and to be acquired

Automating the deployment of the Onyxia datalab requires data of Kubernetes and Cloud native. You should be comfy with the Kubernetes ecosystem, the Hadoop ecosystem, and the distributed computing mannequin. You’ll grasp how the essential parts (HDFS, YARN, object storage, Kerberos, OAuth, and many others.) work collectively to fulfill the makes use of of massive knowledge.

data of utilizing Linux and the command line is required.

In the course of the internship, you’ll be taught:

  • The Kubernetes/Hadoop ecosystem with a view to contribute to the TDP venture
  • Securing clusters with Kerberos and SSL/TLS certificates
  • Excessive availability (HA) of providers
  • The distribution of sources and workloads
  • Supervision of providers and hosted purposes
  • Fault tolerant Hadoop cluster with recoverability of misplaced knowledge on infrastructure failure
  • Infrastructure as Code (IaC) through DevOps instruments similar to Ansible and Vagrant
  • Be comfy with the structure and operation of an information lakehouse
  • Code collaboration with Git, Gitlab and Github

Tasks

  • Develop into accustomed to the structure and configuration strategies of the TDP distribution
  • Deploy and take a look at safe and extremely accessible TDP clusters
  • Contribute to the TDP data base with troubleshooting guides, FAQs and articles
  • Actively contribute concepts and code to make iterative enhancements to the TDP ecosystem
  • Analysis and analyze the variations between the primary Hadoop distributions
  • Replace Adaltas Cloud utilizing Nikita
  • Contribute to the event of a device to gather buyer logs and metrics on TDP and ScyllaDB
  • Actively contribute concepts to develop our help answer

Further info

  • Location: Boulogne Billancourt, France
  • Languages: French or English
  • Beginning date: March 2023
  • Length: 6 months

A lot of the digital world runs on Open Supply software program and the Massive Knowledge business is booming. This internship is a chance to realize invaluable expertise in each domains. TDP is now the one actually Open Supply Hadoop distribution. That is the precise second to affix us. As a part of the TDP crew, you’ll have the likelihood to be taught one of many core large knowledge processing fashions and take part within the improvement and the long run roadmap of TDP. We imagine that that is an thrilling alternative and that on completion of the internship, you’ll be prepared for a profitable profession in Massive Knowledge.

Gear accessible

A laptop computer with the next traits:

  • 32GB RAM
  • 1TB SSD
  • 8c/16t CPU

A cluster made up of:

  • 3x 28c/56t Intel Xeon Scalable Gold 6132
  • 3x 192TB RAM DDR4 ECC 2666MHz
  • 3x 14 SSD 480GB SATA Intel S4500 6Gbps

A Kubernetes cluster and a Hadoop cluster.

Remuneration

  • Wage 1200 € / month
  • Restaurant tickets
  • Transportation go
  • Participation in a single worldwide convention

Prior to now, the conferences which we attended embody the KubeCon organized by the CNCF basis, the Open Source Summit from the Linux Basis and the Fosdem.

For any request for extra info and to submit your utility, please contact David Worms:

Leave a Comment