RedSieve - The Data Management Company
We extract data. We scrub data. We structure data. We stream data. We filter data. We consolidate data. We analyze data. We manage data.
Data Fusion Platform

Seamlessly merge data from disparate data sources in a variety of formats, thus giving you a holistic picture of your domain.

Data Fusion is the art of intelligently linking data from disparate data sources to create a holistic picture that serves a business scenario.

For example, a financial analyst might want to trawl the latest analyst reports for the companies in his portfolio and combine them with the companies’ latest SEC 10-K filings. A product manager might wish to track customer reviews of his latest product across a range of sellers. An app developer showing real-time movie ticket availability might want to include plot summaries of movies from IMDb and reviews from Rotten Tomatoes along with movie showtimes. A healthcare analytics company might need patient information merged and reconciled across multiple health data repositories.

The data sources don’t have to be external. It’s not uncommon for data to sit in isolated silos within an enterprise. Obtaining a complete picture of customers by linking store transactions with online browsing is a big pain point for many e-commerce companies.

While the need for data fusion is clear, the process of fusing data today is far from ideal. In many cases, it’s manual. This is, of course, error-prone and brittle. Even if the process isn’t manual, it’s typically done in an ad-hoc manner that doesn’t scale.

At RedSieve, we have extensive experience working on Data Fusion solutions across a broad set of domains - e-commerce, financial data, reputation management, identity information and health care, to name a few. Drawing on that experience, we have built the general-purpose RedSieve Data Fusion Platform from the ground up. The platform is highly scalable and can handle millions of data elements in realtime. It is also easily configurable - with appropriate configurations, it can be made to work across a range of domains and use cases.

Example 1:

A recruiting company needs to construct a complete profile of potential candidates by linking data about candidates from multiple social media platforms.

Example 2:

An online comparison shopping app would like to consolidate data about a product - prices from multiple vendors, professional reviews, specs from manufacturer sites and comments from forums.

Example 3:

An e-commerce portal with both a physical and online presence would like to learn more about its customers by merging their physical store transactions with their digital ones.
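As a rough illustration of Example 3, the sketch below groups store and online records under a shared customer key. It assumes both sources carry an `email` field, which is a hypothetical schema chosen for illustration; the function name `fuse_by_email` is likewise made up, and matching on a single normalized key is only the simplest form of linking.

```python
from collections import defaultdict


def fuse_by_email(store_rows, online_rows):
    """Group records from two sources under one normalized email key.

    Assumption: every row is a dict with an 'email' field.
    """
    profiles = defaultdict(lambda: {"store": [], "online": []})
    for row in store_rows:
        # Normalize case and whitespace so trivially different keys still match.
        profiles[row["email"].strip().lower()]["store"].append(row)
    for row in online_rows:
        profiles[row["email"].strip().lower()]["online"].append(row)
    return dict(profiles)


store = [{"email": "Ann@Example.com", "total": 40.0}]
online = [{"email": "ann@example.com ", "page": "/desks"}]
profiles = fuse_by_email(store, online)
```

Here both records end up under the single key `ann@example.com`, giving one combined view of the customer. Real data rarely shares such a clean key, so a production approach would need fuzzier matching than this.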

Realtime Data Streaming

Harvest and collect events from a variety of sources in realtime. Respond to events in realtime or use the data for offline analysis and research.

Want to have unparalleled visibility into how customers are using your service or product? What are they clicking on? What other actions are they taking? Which pages are they spending most time on? What are they searching for? Would you like to respond to customer actions in realtime? Or are you more interested in harvesting this data for offline analysis? The RedSieve Realtime Data Streaming Pipeline has been designed and built with these questions in mind.

Working with the pipeline is simple. The pipeline exposes an endpoint that internal systems can write events to. These events are collected, organized, structured and presented in a variety of customizable formats for easy analysis. The pipeline is scalable and can handle billions of events per day.

E-Commerce Solutions

RedSieve provides an array of solutions to e-commerce companies in the realms of product search and discovery, catalog quality, business intelligence, data collection and analytics.

Facet Extraction and Data Enhancement:

A great shopping experience begins with a great product catalog - a catalog with clean, authoritative, structured, organized information.

The RedSieve Facet Extraction Engine uses proprietary technology and algorithms to extract structured, standardized information from raw unstructured blobs of text. The engine can automatically detect synonyms - thus “blk” is treated the same as “black”, and “14 oz” the same as “14.0 ounces”. The engine scours the entire catalog and automatically creates and populates new attributes for each product.
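To make the synonym idea concrete, here is a minimal sketch of value normalization. It is not the RedSieve engine, which is proprietary; the lookup tables, the regular expression and the function names are all assumptions made for illustration, and a real catalog would need far larger tables or learned mappings.

```python
import re

# Hypothetical synonym tables, for illustration only.
COLOR_SYNONYMS = {"blk": "black", "wht": "white"}
UNIT_SYNONYMS = {"oz": "ounce", "ounce": "ounce", "ounces": "ounce"}

# A number (optionally with decimals), optional whitespace, then a unit word.
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*([a-z]+)")


def normalize_color(token):
    """Map a known abbreviation to its full name; pass other values through."""
    token = token.strip().lower()
    return COLOR_SYNONYMS.get(token, token)


def normalize_quantity(text):
    """Return (value, unit) for strings like '14 oz', or None if unrecognized."""
    match = QUANTITY.fullmatch(text.strip().lower())
    if match is None:
        return None
    value, unit = match.groups()
    unit = UNIT_SYNONYMS.get(unit)
    if unit is None:
        return None
    return float(value), unit
```

With these tables, `normalize_color("blk")` returns `"black"`, and `normalize_quantity("14 oz")` and `normalize_quantity("14.0 ounces")` both return `(14.0, "ounce")`, so the two spellings land on the same facet value.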

With such structured facets, the search can be a lot smarter. When a customer searches for a “Vintage wooden espresso writing desk”, the most relevant product can be surfaced. Moreover, many of these attributes can be used for “faceted searching” on the left navigation panel when search results are displayed. More powerful recommendation algorithms that leverage rich faceted data can also be deployed.

Autocomplete done right:

A good search autocomplete solution will ensure three things: 1) A context is included along with the suggested search query - “Flower pots in Home Furnishings”, “Harry Potter in Instant Video” instead of just “Flower Pots” or “Harry Potter”. 2) Redundant suggestions will not be shown - there’s no point in suggesting both “Flower Pot” and “Flower Pots”, a typical characteristic of naive solutions. 3) Using a suggestion always leads to good search results. The RedSieve Autocomplete Engine analyzes query patterns and results and provides a highly relevant and performant autocomplete feature for search.
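Points 1 and 2 above can be sketched in a few lines. This is an illustrative toy, not the RedSieve Autocomplete Engine: the `(query, category, score)` input shape, the function names and the crude trailing-"s" rule for spotting plurals are all assumptions, and a real system would use proper stemming and query-log statistics.

```python
def canonical(query):
    """Crude singular form: lowercase, then drop a trailing 's' from longer words."""
    words = query.lower().split()
    return " ".join(w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words)


def dedupe_suggestions(candidates):
    """Keep the best-scoring suggestion per (canonical query, category).

    candidates: list of (query, category, score) tuples (assumed shape).
    Returns display strings with the category attached as context.
    """
    best = {}
    for query, category, score in candidates:
        key = (canonical(query), category)
        if key not in best or score > best[key][2]:
            best[key] = (query, category, score)
    ranked = sorted(best.values(), key=lambda c: c[2], reverse=True)
    return [f"{query} in {category}" for query, category, _ in ranked]


suggestions = dedupe_suggestions([
    ("Flower Pots", "Home Furnishings", 0.9),
    ("Flower Pot", "Home Furnishings", 0.7),
    ("Harry Potter", "Instant Video", 0.8),
])
```

In this example `suggestions` contains "Flower Pots in Home Furnishings" and "Harry Potter in Instant Video"; the lower-scoring "Flower Pot" is dropped as a near-duplicate.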

Competitor Data Feeds:

How are you faring in comparison with your competitors? How are their prices? How rich is their product catalog? How does their assortment and inventory compare with yours? Are you carrying all variations of a given product? Are you carrying a comparable number of products in a given category?

Identification of Misclassified Products:

Are knitting needles egregiously classified under Shower Curtains and Accessories? Would a customer browsing under “Stylus and Pens” stumble upon a portable golf training aid? Are all cameras in the cameras category actually cameras? Or did a few camera tripods accidentally creep in? Our solution scours the product catalog and provides a recurring report on misclassified products.

Third Party Merchants Marketplace Integration Pipeline:

A full-fledged integration pipeline that incorporates and integrates third party merchant feeds as you work on expanding your selection and bring partners onto your platform.

An event-based metrics platform:

What are your customers mostly searching for? How many search queries lead to no results? How frequently does it happen? Is there anything common to those queries? How does a particular search query’s performance track over time? Over the last week? The last six months? Did its performance suddenly degrade in the last couple of days? (Perhaps someone accidentally changed the image of one of the query’s best-selling products to something offensive.)
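As a small illustration of one such metric, the sketch below computes the share of searches that returned nothing and lists the most frequent empty queries. The event schema (dicts with `query` and `result_count` keys) and the function name are assumptions for the example, not the platform's actual format.

```python
from collections import Counter


def zero_result_report(events, top_n=3):
    """Return (zero-result rate, most common zero-result queries).

    events: iterable of dicts with 'query' and 'result_count' keys (assumed schema).
    """
    total = 0
    empty = Counter()
    for event in events:
        total += 1
        if event["result_count"] == 0:
            # Normalize so 'Blu lamp' and 'blu lamp' count as the same query.
            empty[event["query"].strip().lower()] += 1
    zero_total = sum(empty.values())
    rate = zero_total / total if total else 0.0
    return rate, empty.most_common(top_n)


events = [
    {"query": "red desk", "result_count": 12},
    {"query": "Blu lamp", "result_count": 0},
    {"query": "blu lamp", "result_count": 0},
    {"query": "sofa", "result_count": 3},
]
rate, worst = zero_result_report(events)
```

For these four sample events, half the searches come back empty and "blu lamp" is the only failing query. Tracking the same numbers per day or per week would show whether a query's performance is degrading over time.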

Big Data Consulting

Want something custom? Just contact us!

We have built analytics solutions, deployed complex ETL processes, architected multi-stage big data pipelines, devised data integration strategies and migrated data infrastructure to the cloud, to name just a few. We would love to work with you and help address your data needs.