About Fusemachines
Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey, Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in four countries (Nepal, the United States, Canada, and the Dominican Republic) and more than 400 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.
About the Role:
We are seeking a Data Scientist with hands-on Python experience and a proven ability to support software activities in an Agile software development lifecycle. We are looking for a well-rounded developer to lead a cloud-based big data application using a variety of technologies.
The ideal candidate will possess strong technical, analytical, and interpersonal skills. In addition, the candidate will lead developers on the team to achieve architecture and design objectives as agreed with stakeholders.
This is a remote, contract-based role.
Responsibilities:
Work with developers on the team to meet product deliverables.
Coach developers on the team to develop scalable implementations.
Convert legacy SAS and SPSS code to Python or R.
Work independently and collaboratively on a multi-disciplined project team in an Agile development environment.
Contribute to detailed design and architectural discussions, as well as customer requirements sessions, to support the implementation of code and procedures for our big data product.
Design and develop clear and maintainable code with automated tests using open-source frameworks such as pytest and unittest.
Lead developers on the team to meet product deliverables.
Identify and implement code and design optimizations.
Learn and integrate with a variety of systems, APIs, and platforms.
Interact with a multi-disciplined team to clarify, analyze, and assess requirements.
Be actively involved in the design, development, and testing activities in big data applications.
Requirements:
Minimum of 3 years of hands-on experience with Python and PySpark, Jupyter Notebooks, and Python environment managers such as Poetry or Pipenv.
The ability to convert SAS and SPSS code to Python.
Ability and desire to learn Julia and R in order to convert legacy programs to Python and Spark for maintainability.
Familiarity with Databricks. Azure Databricks is a plus.
Familiarity with data cleansing, transformation, and validation.
Proven technical leadership on prior development projects.
Hands-on experience with a version control platform such as GitHub, Azure DevOps, Bitbucket, etc.
Hands-on experience building pipelines in GitHub (or Azure DevOps, Jenkins, etc.).
Hands-on experience with Spark.
Hands-on experience using relational databases such as Oracle, SQL Server, MySQL, PostgreSQL, or similar.
Experience using Markdown to document code in repositories, or automated documentation tools such as pydoc.
Strong written and verbal communication skills.
Self-motivated, with the ability to work well in a team.
Nice to Have:
Experience with data visualization tools such as Power BI or Tableau.
Experience with DevOps CI/CD tools and automation processes (e.g., Azure DevOps, GitHub, Bitbucket).
Experience with containers and their environments (Docker, Podman, Docker Compose, Kubernetes, Minikube, kind, etc.).
Experience with Azure Cloud Services and Azure Data Factory.
Education:
Bachelor of Science degree from an accredited university
Fusemachines is an Equal Opportunity Employer: all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, protected veteran status, or any other legally protected group status.