Client Overview
Development Services Group, Inc. (DSG) is a professional services firm that provides research, technical assistance, and program support to federal agencies, including the Department of State.
Project Background
Contract: Department of State Contract 19AQMM18F2561
Duration: 2018-2025 (8 years)
Initial Engagement
Slova Applications was initially hired for database administration expertise with a specific mandate to design and implement a schema for storing news content that referenced terrorist incidents globally. This database solution needed to be robust, scalable, and optimized for complex queries related to terrorism data analysis.
Expanded Scope
As the project progressed, our role expanded significantly to include:
Data Pipeline Development
We designed and built comprehensive data pipelines that:
- Ingested news content from multiple international sources
- Processed and normalized the data for consistent analysis
- Implemented validation procedures to ensure data integrity
- Created efficient storage solutions that balanced performance with accessibility
Machine Learning Classification System
Slova developed a sophisticated machine learning classifier that:
- Screened incoming content for relevance to terrorist incidents
- Applied natural language processing to analyze context and geocode incident reports
- Evaluated content against established inclusion criteria
- Reduced manual review time by automatically filtering irrelevant content
- Improved data quality by ensuring consistent application of inclusion standards
AI-Powered Question Answering
We created an AI chatbot that:
- Leveraged OpenAI’s LLM models to answer natural language questions about global terrorism
- Integrated with the published dataset to provide accurate, data-driven responses
- Enabled intuitive information access for non-technical users
- Synthesized complex data points across multiple sources
Technologies Used
- Database Technologies: AWS Aurora MySQL RDS and MS SQL Server
- Data Processing: Scalable ETL pipelines written in Python for handling diverse content sources
- Machine Learning: Natural language processing using gensim and Spacy, with classification algorithms from scikit-learn
- AI Integration: OpenAI’s LLM models for conversational intelligence
- Quality Assurance: Automated testing and validation frameworks
Outcomes
- Successfully designed and maintained a robust database schema that evolved with project requirements over 8 years
- Built reliable data pipelines that processed thousands of news items daily with minimal downtime
- Developed a machine learning system that achieved >90% accuracy in content classification
- Created an AI chatbot that provided natural language access to terrorism data, improving information discovery
- Significantly reduced manual review workload while maintaining high data quality standards
- Created documentation and knowledge transfer processes to ensure long-term sustainability