Sector: IT / telecoms
Size: 20 to 100 employees
Job description:
At SESAMm, we provide tools for the asset management industry, based on our proprietary big data, artificial intelligence, and natural language processing technologies. We analyze a huge amount of unstructured textual data extracted in real time from millions of news articles, blogs, forums, and social networks. We use this alternative data in combination with standard market data to provide innovative analytics on thousands of financial products across all asset classes, and to develop custom investment strategies using our internal machine learning and statistical expertise. With more than EUR 8M raised since our creation in 2014, major clients across the world, numerous awards, and exponential team growth, we are expanding quickly in Western Europe, the Americas, and Asia.
Join SESAMm, an innovative and fast-growing FinTech company!
Overarching goal: you will build and scale data components for key SESAMm products, such as the raw data ingestion pipeline, job scheduling, and ETL design/optimization; drive the migration of the Product Data Platform toward cloud or on-premise solutions; and establish data development best practices for other tech team members.
Communicate your team's work through weekly updates.
Key activities:
- Design and implement the best data pipelines for our text-based products (ingestion, processing, exposure):
- Test and design state-of-the-art data ingestion pipelines
- Implement efficient streaming services
- Lead the acquisition of new data sources
- For each new data source, describe its feasibility and potential
- Integrate the new data into the datalake
- Simplify use of the new data request engine
- Optimize current queries
- Process and integrate data in new databases or datalake
- Ensure maintainability and create update systems
- Work experience: 2-5 years in data engineering, or any at-scale data processing experience
- Good understanding of different databases and data storage technologies
- Very good knowledge of distributed computing systems, such as Spark, in both standalone and cluster deployments
- Good knowledge of cloud computing platforms, such as AWS, GCP, or Azure
- Development: mastery of at least one language among Python, Java, and Scala; at minimum, working knowledge of Python
- Good communication and knowledge-sharing skills: ability to understand technical teams' needs and issues, and to collaborate with several internal teams. Team player.
- Additional skills: strong interest in data science / natural language processing
- Location: Tunis
- Duration: permanent contract / time commitment: 100%