The Next Wave of technology & innovation

Expert Interview with Anton Xavier of FoodEssentials


Anton Xavier, Founder and CEO, FoodEssentials Corporation

FoodEssentials Corporation provides best of breed technologies to solve problems related to food label data for consumers, manufacturers, retailers and the government.

When you market FoodEssentials data and analytics, how informed are prospects about the value proposition and ROI for analytics?

One of our core offerings solves a pain point that the industry has been experiencing for a long time. We offer retailers a single data platform that services all of their needs whilst taking the pain out of collecting, updating, and maintaining a highly accurate and granular product database. With regard to ROI, the value of our data platform is first and foremost a cost-minimization opportunity, but it also positions retailers, and consequently brands, to more effectively attack shopper marketing and shopper engagement initiatives.

What standards are facilitating your food label data? Conversely, what standards are lacking that would make life easier for your team?

When it comes to product data standards, we have a close partnership with the FDA, the driving force behind product labeling guidelines and regulations. Most of the standards for product label data, and for food in particular, are fairly well defined with regard to how the data is represented on the package. However, one of our challenges is to bring standardization and uniform interpretation to all the unstructured and unregulated data that is represented on the package, such as marketing claims.

There is a lot of confusion within the industry, and consequently among shoppers, regarding the interpretation of this unstructured data. Hence we are leading the way in providing a framework for a standardized analysis of this data.

Considering food label data management, do you find that Master Data Management practices are the same or different when scaled up for Big Data, especially for real time, or near-real time applications?

One of the core value propositions of our data platform for industry is the fact that we offer a Master Data Management solution for their product data. There are a lot of challenges in handling vast amounts of unstructured product data, particularly if the data comes from several different sources. We help standardize this all for retailers by controlling the source, accuracy, structure, and currency of the data. One of the keys to our data at this basic level is that we offer deep granularity of what we call “product attributes” which helps our clients to query the data in an infinite number of ways. We presently have over 7,500 attributes that provide an almost limitless source of product understanding.
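To illustrate what that attribute-level granularity makes possible, here is a minimal sketch of querying products along arbitrary combinations of attributes. The attribute names and data below are invented for illustration and are not FoodEssentials' actual schema:

```python
# Hypothetical sketch: querying products by granular label attributes.
# Attribute names and values are invented for illustration only.
products = [
    {"upc": "036000291452", "gluten_free_claim": True,
     "sugars_g": 12.0, "contains_peanuts": False},
    {"upc": "012345678905", "gluten_free_claim": False,
     "sugars_g": 3.0, "contains_peanuts": True},
]

def query(products, **criteria):
    """Return products whose attributes satisfy every criterion.

    A criterion is either a literal value to match exactly or a
    predicate function applied to the attribute's value.
    """
    def matches(product):
        for attr, want in criteria.items():
            have = product.get(attr)
            ok = want(have) if callable(want) else have == want
            if not ok:
                return False
        return True
    return [p for p in products if matches(p)]

# Combine any attributes: gluten-free claim AND under 15 g of sugars.
low_sugar_gf = query(products, gluten_free_claim=True,
                     sugars_g=lambda g: g is not None and g < 15)
```

With thousands of attributes rather than four, the same pattern lets a client slice the catalog along any dimension the label data supports.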

However, when we scale our data for "Big Data" initiatives, where we might intersect our data with vast amounts of point-of-sale data to help our clients better understand what people are purchasing, it is important for us to filter out the noise. To achieve this we pass our attributes through a filter we call our Key Message Engine, which essentially limits the data to only what is important. One of our biggest challenges is managing the intelligence that gets programmed into our Key Message Engine whilst staying true to letting the data drive the insight.

If you had access to data streams from applications like Apache Storm, which could deliver information such as individualized consumer product handling, interactive customer preference selections or sensor streams, how would this change your current designs?

I’m not sure I understand this question but let me take a stab.

As I mentioned above, some filtering of our data structure is needed when working with real-time or Big Data implementations. We have therefore developed technology, our Key Message Engine, that essentially filters our core data by analyzing context and only delivers the data that is relevant to any single instance. This provides us with the flexibility to continue to extract meaningful insight at scale in real-time and Big Data scenarios. At a high level, we remove the noise before it is produced. Applications like Apache Storm could be integrated into our systems to enhance our engine and the complex processes we work with, to better streamline our "speed to insight."
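The filtering idea described above can be sketched as a simple stream filter. This is a hypothetical illustration, not FoodEssentials' actual engine: the context names, relevance map, and attributes are all invented, and in a Storm-style topology this logic would live in one processing stage (a "bolt") ahead of any downstream aggregation:

```python
# Hypothetical sketch of a "key message" style filter: given a shopper
# context, pass through only the attributes relevant to that context.
# Context names, attributes, and the relevance map are invented.
RELEVANT = {
    "gluten_sensitivity": {"gluten_free_claim", "contains_wheat"},
    "low_sugar_diet": {"sugars_g", "added_sugar_claim"},
}

def key_messages(attribute_stream, context):
    """Yield only the (attribute, value) pairs relevant to the context.

    Filtering happens before downstream analytics ever see the event,
    i.e. the noise is removed before it is produced.
    """
    wanted = RELEVANT.get(context, set())
    for attr, value in attribute_stream:
        if attr in wanted:
            yield attr, value

stream = [("sugars_g", 12.0), ("contains_wheat", False),
          ("brand_story", "..."), ("gluten_free_claim", True)]
filtered = list(key_messages(stream, "gluten_sensitivity"))
```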

How does FoodEssentials address possible accuracy or food safety issues with data given that much of the data appears to be supplied externally? What data quality issues are less scalable than others?

Accuracy of data has been one of our most considerable challenges. After going through many iterations and testing many different models, we found that the most important way we can ensure quality is to own the collection process. Therefore, at this stage, we physically collect all the data in our database directly from the packaging of products that are in store. This is actual data from products that are "live" in the marketplace. We then focus on keeping that data up to date by recollecting it. We leverage technology and practical solutions to digitize and QA the data at a fraction of our former cost. We can now push through over 50,000 new or updated products a month, and I think we could push through several times that if we pushed ourselves.

What privacy and security issues are associated with what could be personally identifiable information contained in the LabelAPI when used for individualizing content? What practices must be shared by customer developers?

At FoodEssentials we never come into contact with any personally identifiable data. We work with user IDs, as provided by our clients, and return the insight based on that anonymous unique identifier. All of our enterprise-level clients are HIPAA compliant where relevant, and we encourage smaller customers and developers building on our data to also be aware of the legal requirements for protection. Our experience is that everyone who works with our data is motivated to support shoppers and their right to privacy.
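One common way a client can produce the kind of opaque user IDs described above is keyed hashing, so the internal identifier never leaves the client's systems. This is a generic sketch of that practice, not FoodEssentials' documented integration; the key and function names are hypothetical:

```python
import hashlib
import hmac

# Hypothetical sketch: a client pseudonymizes its own user identifiers
# before sending them to a data provider, so the provider only ever
# sees a stable, opaque, non-reversible ID.
SECRET_KEY = b"client-side-secret"  # held by the client, never shared

def opaque_user_id(internal_id: str) -> str:
    """Derive a stable anonymous identifier via keyed hashing (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, internal_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

Because the hash is keyed, the provider cannot reverse or even brute-force the mapping without the client's secret, yet the same user always maps to the same ID, which is what individualized content needs.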

Does ETL play a role in your data pipeline?

We tightly control the data that comes into our core pipeline in order to control accuracy. We do this from the point of image acquisition through to re-collection and are therefore able to realize economies of scale across our acquisition network. However, we are often charged with working with a variety of different data formats when working directly with retailers and their inventory lists. We use these to help guide our data collection efforts and are constantly challenged to transform their UPCs, and their categorizations of UPCs, into something that works for our needs.
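A typical "transform" step for such retailer lists is normalizing assorted UPC formats to a canonical form and validating them. The sketch below is illustrative, not FoodEssentials' actual pipeline; it uses the standard UPC-A check-digit rule (three times the sum of the odd-position digits plus the sum of the even-position digits must end in the check digit's complement):

```python
# Hypothetical sketch of UPC normalization in an ETL transform step:
# strip punctuation, left-pad to 12 digits, verify the UPC-A check digit.
def normalize_upc(raw: str) -> str:
    """Return the canonical 12-digit UPC-A string, or raise ValueError."""
    digits = "".join(ch for ch in raw if ch.isdigit()).rjust(12, "0")
    if len(digits) != 12:
        raise ValueError(f"not a UPC-A code: {raw!r}")
    odd = sum(int(d) for d in digits[0:11:2])   # positions 1, 3, ..., 11
    even = sum(int(d) for d in digits[1:10:2])  # positions 2, 4, ..., 10
    check = (10 - (3 * odd + even) % 10) % 10
    if check != int(digits[-1]):
        raise ValueError(f"bad check digit in {raw!r}")
    return digits

canonical = normalize_upc("0 36000 29145 2")  # → "036000291452"
```

Validating at the transform stage catches truncated or mistyped codes before they silently mismatch against the product database.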

What big data trends are you watching most keenly?

In the retail industry, particularly in grocery, there have been a lot of misapplications of "Big Data" initiatives, and there is a general wariness towards "Black Box" big data solutions. As a result, we strive to distance ourselves from these failures by providing solutions that are open and empowering. In that way we subscribe to the trend towards small data, or, better put, the democratization of data processing and data-driven decision making. We employ big data philosophies and, indeed, technologies, but we strive to ensure the end result is a transparent, open platform that empowers our clients.

What features in Big Data software do you feel are lacking?

We feel that Big Data software still has the challenge of overcoming the “Black Box” reputation to garner mainstream buy-in. Software that creates a view into data that addresses the conflict between transparency and complexity will ultimately succeed and empower a new wave of mainstream insight generation. We’re very excited about companies that are bridging that gap and look forward to taking our place at that table.
