Security is built into the heart of our system architecture and organisational practices.
System Architecture
Dataro's services run entirely on Amazon Web Services, a top-tier cloud provider.
app.dataro.io
Dataro's client-facing web app uses a serverless architecture. AWS Amplify hosts the static ReactJS components, while API Gateway and Lambda serve API requests from the frontend. Data is served from an AWS Aurora Postgres database, which resides in private subnets in our cloud; there is no route from the internet to the instance.
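To illustrate the request path described above, here is a minimal sketch of a Lambda function behind API Gateway, assuming the proxy integration event shape. The `campaign` query parameter and the response fields are hypothetical and chosen only for illustration; this is not Dataro's actual handler.

```python
import json


def _response(status_code, payload):
    """Build a response in the shape API Gateway's proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Entry point invoked by Lambda for each API request.

    `campaign` is a hypothetical parameter used for illustration only.
    """
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    campaign = params.get("campaign")
    if not campaign:
        return _response(400, {"error": "campaign is required"})
    # A real handler would query the database from inside the private subnet.
    return _response(200, {"campaign": campaign, "status": "ok"})
```

Because the function is stateless and only reachable through API Gateway, there is no long-lived web server to patch or expose.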
Batch Processing
The majority of data processes at Dataro run on AWS Batch, a service that initialises a transient compute environment (EC2) to run a containerised (Docker) program. These transient compute environments exist only for as long as the process takes to run and are not accessible from the internet (however, we can establish outbound connections through a NAT Gateway). In addition to being a cost-efficient way to use powerful computing resources as required, this has numerous security advantages.
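As a rough sketch of how a containerised process might be handed to AWS Batch, the example below builds a request for the `submit_job` call of the boto3 Batch client. The job name, queue name, job definition, command and environment variable are all hypothetical, and boto3 (a third-party package) is imported only inside `submit` so the request-building part runs without it.

```python
def build_job_request(job_name, job_queue, job_definition, command, environment=None):
    """Assemble keyword arguments for the Batch client's submit_job call."""
    env = [
        {"name": key, "value": value}
        for key, value in sorted((environment or {}).items())
    ]
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Overrides apply to this run only; the job definition is unchanged.
        "containerOverrides": {"command": list(command), "environment": env},
    }


def submit(request):
    """Submit the job and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # deferred so the module can be imported without boto3

    client = boto3.client("batch")
    response = client.submit_job(**request)
    return response["jobId"]


# Hypothetical names, for illustration only.
request = build_job_request(
    job_name="nightly-scoring",
    job_queue="example-queue",
    job_definition="example-job-definition",
    command=["python", "run.py"],
    environment={"STAGE": "prod"},
)
```

Once the container exits, Batch can scale the compute environment back down, so nothing remains running for an attacker to target.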
Data Persistence
Data is stored at rest in AWS Simple Storage Service (S3). It is kept in private buckets encrypted with AES-256 and may only be accessed by Dataro employees using IAM profiles with multi-factor authentication (MFA) enforced.
Software Development
Dataro uses Atlassian Bitbucket Pipelines and AWS Amplify to automatically build, test and deploy our software. Every change made to the system passes through a series of manual and automated checkpoints to ensure that the security of the entire system is maintained:
- Manual review of code changes in Pull Requests (PRs) by a senior engineer or the CTO.
- Automated linting of code (PyLint, JSLint) to ensure code quality. Code can only be built if it meets a certain level of quality.
- Automated static analysis of code using Bandit to identify potential security issues such as SQL injection. Builds cannot be deployed if any issues are identified.
- Automated static analysis of imported OSS libraries using Safety. Builds cannot be deployed if the versions of the imported libraries have open CVEs.
- Dynamic testing of builds using unittest and Cypress.
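To make the static analysis and unit testing checkpoints concrete, here is a small sketch of the kind of issue Bandit is designed to catch, together with a unittest case. The `donors` table and both lookup functions are hypothetical. The first function builds SQL by string formatting, the pattern Bandit's B608 check (hardcoded SQL expressions) targets; the second uses a parameterised query.

```python
import sqlite3
import unittest


def find_donor_unsafe(conn, name):
    """String-built SQL: shown only as the pattern a scanner should reject."""
    query = "SELECT id FROM donors WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_donor_safe(conn, name):
    """Parameterised query: the driver treats `name` strictly as data."""
    return conn.execute("SELECT id FROM donors WHERE name = ?", (name,)).fetchall()


def make_test_db():
    """Create an in-memory database with a hypothetical donors table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE donors (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO donors (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


class DonorLookupTest(unittest.TestCase):
    PAYLOAD = "x' OR '1'='1"

    def setUp(self):
        self.conn = make_test_db()

    def tearDown(self):
        self.conn.close()

    def test_safe_lookup_ignores_injection(self):
        self.assertEqual(find_donor_safe(self.conn, self.PAYLOAD), [])

    def test_safe_lookup_finds_exact_match(self):
        self.assertEqual(len(find_donor_safe(self.conn, "alice")), 1)

    def test_unsafe_lookup_leaks_every_row(self):
        self.assertEqual(len(find_donor_unsafe(self.conn, self.PAYLOAD)), 2)
```

Saved as a module, the tests can be run with `python -m unittest`. In a pipeline like the one described above, the unsafe variant would be expected to fail the static analysis step before any test runs.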