Unified Information Access Blog

Welcome to Attivio's Unified Information Access Blog. Join us for discussions on topics ranging from enterprise search solutions, information access insights, Agile software development methodology to programming with Java. We hope you'll find the articles informative and participate in the discussions by leaving a comment.



We recently offered visitors to our web site the opportunity to submit their individual information access challenges, promising to respond with a description of how we would solve each problem using the Active Intelligence Engine (AIE). So far, we have received some great feedback, and we plan to address all of the submissions.

Our first story was submitted by an IT Director from a multimedia company that is looking for a way to streamline how its knowledge workers perform research in order to develop new content.

Overview of the Problem - User Submitted

At our company today, a large number of personnel can be classified as knowledge workers. They work on some form of manipulated information. Without a good understanding of the information sources available within the process domain, it is difficult for knowledge workers to identify and obtain the information required to do the job.

The typical knowledge worker requires access to numerous information sources. For example, a single decision could require information from several databases, letters, and the expertise of a coworker. This means the knowledge worker needs to know where to locate the appropriate information, which databases contain relevant data, how to connect to the databases, how to query the databases, how to interpret the data and which coworker has the required expert knowledge.

The knowledge worker's ability to access this information directly affects the quality of his or her work. Because most knowledge workers have a limited amount of time to complete a process, difficulty identifying or obtaining information can, in effect, make that information unavailable. Furthermore, so much time can be spent trying to find the information that little time is left for the actual decision making.

Also, due to inexperience in a business process and its relevant information sources, a worker may overlook or misapply important information. And in many cases, the sheer amount of information available can simply exceed human information processing capabilities, having essentially the same effect as a lack of information. These problems frequently impair the timeliness and quality of the knowledge worker's output, creating an overall drag on worker productivity and effectiveness.

Details

  • Intranet Search, External/Web Search, Embedded Search

  • Documents (MS Office, PDF, XML, etc.), Databases (SQL Server, Oracle, MySQL, etc.), Images (JPG, GIF, PNG, TIFF, etc.), Video (AVI, MPG, WMV, SWF, FLV, etc.), Email (Including Attachments)

  • >100m records, multiple data centers/locations, 25-50% content growth year over year

The UIA Solution


Problem 1: The typical knowledge worker requires access to numerous information sources.


The first step is to load all your content into the AIE index. This is done using "out of the box" connectors and AIE workflows.

| Content Type | AIE Connector Type | Comment |
| --- | --- | --- |
| Documents (MS Office, PDF, XML, etc.) | FileConnector | Simple configuration to instruct AIE where to go to pick up the files, with hundreds of formats supported. An example would be to instruct AIE to go to a directory on a file server on your network and pull in all the files it finds. |
| Databases (SQL Server, Oracle, MySQL, etc.) | DBConnector | Similar to how any other application connects to a database. Provide the connection path to the database and tell AIE about the data by passing a SQL query to the database. AIE will execute the query and pull all the results into the index, thereby making them searchable. |
| Images (JPG, GIF, PNG, TIFF, etc.) | Image/OCRConnector | AIE can pull in image files and extract the meta information related to each image. If desired, the results can be configured to be displayed as the original image with the option to also have the text displayed behind the image. |
| Video (AVI, MPG, WMV, SWF, FLV, etc.) | FileConnector | At present, AIE can extract information about multimedia files such as title, author, description and other metadata associated with the file. |
| Email (Including Attachments) | FileConnector + Ingestion workflow | Emails, attachments and embedded objects are all handled the same as any other file. The AIE workflows will process them just like any other file, preserving the parent-child relationship between an email and its attachments. |
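To make the idea of connectors feeding one index more concrete, here is a minimal sketch in Python. It is not AIE's connector API; the function names (ingest_files, ingest_query, build_index, search) and the sample data are assumptions made up for illustration. It only shows the general pattern: each source is turned into records with a common shape, and a single search runs over all of them.

```python
import sqlite3
import tempfile
from pathlib import Path


def ingest_files(directory):
    """Yield one record per file under `directory` (a file-connector analogue)."""
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            yield {
                "source": "file",
                "id": path.name,
                "text": path.read_text(encoding="utf-8", errors="ignore"),
            }


def ingest_query(connection, sql):
    """Yield one record per row returned by `sql` (a database-connector analogue)."""
    cursor = connection.execute(sql)
    columns = [col[0] for col in cursor.description]
    for row in cursor:
        fields = dict(zip(columns, row))
        yield {
            "source": "database",
            "id": str(fields["id"]),
            "text": " ".join(str(value) for value in fields.values()),
        }


def build_index(*record_streams):
    """Merge every stream into one list that a single search can scan."""
    return [record for stream in record_streams for record in stream]


def search(index, term):
    """Return every record whose text contains `term`, regardless of source."""
    term = term.lower()
    return [record for record in index if term in record["text"].lower()]


# Hypothetical demo data: one text file and two database rows.
workdir = tempfile.mkdtemp()
Path(workdir, "memo.txt").write_text("Basketball season preview", encoding="utf-8")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stories (id INTEGER, headline TEXT)")
db.execute("INSERT INTO stories VALUES (1, 'Basketball finals recap')")
db.execute("INSERT INTO stories VALUES (2, 'Hockey trade rumors')")

index = build_index(
    ingest_files(workdir),
    ingest_query(db, "SELECT id, headline FROM stories"),
)
hits = search(index, "basketball")
print([(hit["source"], hit["id"]) for hit in hits])
```

A real platform would add format parsing, incremental updates and an inverted index; the point here is only that the file and the database row come back from the same query.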


As content is ingested, AIE runs a series of algorithms that extract entities such as people, places and things (a process known as entity extraction). This allows AIE to classify content and return better matches to queries.

It is not necessary to pre-configure any of this processing, as AIE can dynamically generate dictionaries and entities based on your content as it's ingested. For example, if your content is about TV programs, AIE would find things like program names, actors, affiliates, etc., as long as the information is contained in the content.
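For readers unfamiliar with entity extraction, the simplest form is a dictionary lookup. The Python sketch below is an illustration only and makes no claim about how AIE implements it: the dictionary is hard-coded with made-up names, whereas the paragraph above describes dictionaries being generated from the content itself.

```python
import re

# Hypothetical dictionary with made-up names, hard-coded for the example.
ENTITY_DICTIONARY = {
    "program": ["Evening News", "Morning Show"],
    "person": ["Jane Doe", "John Smith"],
}


def extract_entities(text, dictionary=ENTITY_DICTIONARY):
    """Return {entity_type: [names found]} using whole-phrase, case-insensitive matching."""
    found = {}
    for entity_type, names in dictionary.items():
        for name in names:
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.setdefault(entity_type, []).append(name)
    return found


doc = "Jane Doe joins the Evening News as co-anchor next month."
entities = extract_entities(doc)
print(entities)
```

Entities found this way can be stored alongside each document, which is what makes fields such as "program" or "person" available for navigation later.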

Problem 2: The knowledge worker needs to know where to locate the appropriate information, which databases contain relevant data, how to connect to the databases, how to query the databases, how to interpret the data and which coworker has the required expert knowledge.


When users search across all the available content, the results may be presented in a straightforward list, but no one can efficiently wade through thousands of results. The key to solving this problem is 'faceting'. You're probably familiar with facets (or dimensions, as some people refer to them): they're a staple of eCommerce sites. For example, when you search for 'camera' you get cameras, but also lists of manufacturers (and the counts for each), price ranges, etc. that help you drill down to the information that best matches your selection criteria.

This is an incredibly useful approach in the enterprise, but it's more challenging there than in a neatly organized catalog of products. You can define the facets for an eCommerce site in advance, but for enterprise content you really need a platform that can analyze the result set and recommend the facets that best help end users understand, navigate and drill into the content. AIE handles all of that at query time, so users are given relevant guidance based on either the content or the results, and the facets then guide them as they explore the results more deeply.

For example, let's say you have ingested sports-related content from files/documents, video and a database of news stories. If you were searching on basketball, AIE could return facets based on:

  • Source = file, video, database (if more than one database is in use, it would note which one it came from)
  • Author
  • Date
  • Teams
  • Player
  • etc...

If there are very specific facets you always want to appear, AIE lets you easily specify them as well. The important thing to remember is that AIE does not require any up-front configuration to present facets; they are all dynamically generated. This alleviates a lot of pressure because, when you are dealing with vast amounts of content, it's very difficult to know how each and every piece of content should be classified and then to keep that classification up to date.
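The mechanics of faceting are easy to sketch. The Python below is a generic illustration, not AIE code: it counts the values of each field across a result set at query time and then narrows the results when a user picks one value. The field names and sample records are invented for the basketball example.

```python
from collections import Counter


def compute_facets(results, fields):
    """Count the values of each field across the result set."""
    facets = {}
    for field in fields:
        counts = Counter()
        for record in results:
            value = record.get(field)
            if value is None:
                continue  # this record simply has no value for the facet
            values = value if isinstance(value, list) else [value]
            counts.update(values)
        facets[field] = counts
    return facets


def drill_down(results, field, value):
    """Keep only the results whose field equals, or contains, the chosen value."""
    kept = []
    for record in results:
        current = record.get(field)
        if current == value or (isinstance(current, list) and value in current):
            kept.append(record)
    return kept


# Hypothetical results for a search on "basketball".
results = [
    {"title": "Finals recap", "source": "database", "author": "Lee", "teams": ["Team A", "Team B"]},
    {"title": "Season preview", "source": "file", "author": "Kim", "teams": ["Team A"]},
    {"title": "Post-game interview", "source": "video", "author": "Lee"},
]

facets = compute_facets(results, ["source", "author", "teams"])
print(facets["author"].most_common())

narrowed = drill_down(results, "teams", "Team A")
print([record["title"] for record in narrowed])
```

Because the counts are computed from whatever the query returned, nothing has to be classified by hand beforehand, which is the property the paragraph above emphasizes.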

AIE also lets you leverage the expertise of your workers by allowing them to add comments or tags directly to results. Those comments are then instantly available to other users who are searching content in that index. Comments are added dynamically, without the need to resubmit the entire document or record to the ingestion process. Tagging the results that users find interesting or helpful improves the precision of future searches for other users.

Problem 3: Because most knowledge workers have a limited amount of time to complete a process, difficulty identifying or obtaining information can, in effect, make that information unavailable. Furthermore, so much time can be spent trying to find the information that little time is left for the actual decision making.


What differentiates AIE from other solutions is that it can ingest content from many different sources and store it in a single, universal index, thereby making search far simpler.

Without knowing all of the requirements under this scenario, let's assume the knowledge worker has to perform a search on the intranet to find files. They then have to analyze the results and determine which information is relevant. Then they have to repeat the same task against each database repository. They may have to search an employee database for engineers, because they are looking for subject matter experts based on job titles, departments or previous projects. Then they have to send emails around asking for help or asking whether anyone else knows anything about their research. This is a costly process, and it can be avoided by aggregating all of the content into one search index and unifying the sources.

For example, you might assume your most likely pool of experts are people who have authored a white paper and/or created a video interview, are in the research group, or have sent a lot of email regarding the topic you are researching.

The old way:

  • Search the file system for white papers.
  • Then search the various databases trying to devise the SQL query that answers your question and then sift through data even further.
  • Search through videos looking for people related to your topic.

The new Unified Information Access (UIA) way:

AIE allows you to "JOIN" your search across data ingested from database tables (say, job titles from an employee database), video file metadata (author, descriptions) and white papers (documents). A single query searches across all of these sources and presents the results in a unified interface. The end user does NOT have to write such a query; the front end can be configured to make that easy.
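As a rough illustration of what such a join does, the Python sketch below finds likely experts on a topic by matching white papers and videos against the topic and then joining their authors to employee records. It is not AIE's query syntax; the record layout, the find_experts function and all names in the sample data are hypothetical.

```python
from collections import defaultdict

# Records as they might look after ingestion into one index; all values are made up.
index = [
    {"type": "employee", "name": "A. Rivera", "department": "Research", "title": "Engineer"},
    {"type": "employee", "name": "B. Chen", "department": "Sales", "title": "Account Manager"},
    {"type": "white_paper", "author": "A. Rivera", "text": "Video codecs for streaming"},
    {"type": "video", "author": "A. Rivera", "text": "Interview on streaming codecs"},
    {"type": "white_paper", "author": "B. Chen", "text": "Streaming market pricing"},
    {"type": "video", "author": "B. Chen", "text": "Quarterly sales kickoff"},
]


def find_experts(index, topic, department=None):
    """Join topic matches in content records to employee records by author name."""
    topic = topic.lower()
    employees = {r["name"]: r for r in index if r["type"] == "employee"}

    # Collect, per author, the kinds of content that mention the topic.
    matches = defaultdict(list)
    for record in index:
        if record["type"] != "employee" and topic in record["text"].lower():
            matches[record["author"]].append(record["type"])

    experts = []
    for author, content_types in matches.items():
        person = employees.get(author)
        if person is None:
            continue  # author is not in the employee data
        if department and person["department"] != department:
            continue
        experts.append(
            {"name": author, "title": person["title"], "evidence": sorted(content_types)}
        )
    # People with more matching content are listed first.
    return sorted(experts, key=lambda expert: len(expert["evidence"]), reverse=True)


experts = find_experts(index, "streaming")
for expert in experts:
    print(expert["name"], expert["title"], expert["evidence"])

research_only = find_experts(index, "streaming", department="Research")
```

The end user would never see code like this; a front end would expose the topic and department as simple search fields.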

Problem 4: Due to inexperience in a business process and its relevant information sources, a worker may overlook or misapply important information.


This is where the 'active' piece of AIE applies. AIE queries can be saved and used to alert users when conditions have changed or been met. Workflows can be customized to automate or drive other business processes, such as generating reports or publishing data to other applications.
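The saved-query idea can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not AIE's alerting mechanism: the SavedQuery and AlertingIndex classes, the all-terms matching rule and the notify callback are all hypothetical.

```python
class SavedQuery:
    """A stored search: an owner plus terms that must all appear in a document."""

    def __init__(self, owner, terms):
        self.owner = owner
        self.terms = [term.lower() for term in terms]

    def matches(self, document):
        text = document["text"].lower()
        return all(term in text for term in self.terms)


class AlertingIndex:
    """Checks every newly ingested document against the saved queries."""

    def __init__(self, notify):
        self.documents = []
        self.saved_queries = []
        self.notify = notify  # callback, e.g. send an email or start a workflow

    def save_query(self, owner, terms):
        self.saved_queries.append(SavedQuery(owner, terms))

    def ingest(self, document):
        self.documents.append(document)
        for query in self.saved_queries:
            if query.matches(document):
                self.notify(query.owner, document)


# Demo: collect alerts in a list instead of sending real notifications.
sent = []
index = AlertingIndex(notify=lambda owner, doc: sent.append((owner, doc["id"])))
index.save_query("researcher@example.com", ["licensing", "video"])

index.ingest({"id": "doc-1", "text": "New video licensing terms announced"})
index.ingest({"id": "doc-2", "text": "Cafeteria menu for the week"})
print(sent)
```

Swapping the callback for one that generates a report or publishes to another application gives the workflow-driven behavior described above.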

Users can comment directly into results and have the content immediately appear in search results.

Results that are relevant to previous users' searches can be tagged or rated so they appear in future users' searches, building upon others' experiences and expertise.

Problem 5: In many cases, the sheer amount of information available can simply exceed human information processing capabilities, having essentially the same effect as a lack of information. These problems frequently impair the timeliness and quality of the knowledge worker's output, creating an overall drag on worker productivity and effectiveness.


Analysts estimate that on average knowledge workers spend 28 hours per week performing tasks that relate to finding information rather than using information in an enterprise.

When your information can be searched from a single location and the results are presented in an intelligent and organized manner, you're saving time and freeing your people up to focus on producing and analyzing content versus spending all their time searching for it.

Conclusion


Does this scenario resonate with you? Do you have similar information access problems? Send us your story and we'll show you how we can rapidly prototype a solution to your problem, using your content and without all of the headaches associated with answering endless requirements questions and sales presentations.

If we select your submission for publication and response by our UIA Experts, we'll send you a $50 Amex Gift Card*. Don't worry, your personal information and company details won't be published in the article. The gift card is simply our way of thanking you for helping us share these valuable insights with others and for helping us to build a better product. We take customer and prospect input very seriously and we are willing to demonstrate it by providing straight answers in a public forum via our blog.

Tell us your problems, we're listening!

Click here to submit your story.

*Attivio reserves the right to determine which submissions will be selected for publication.
