Atomic Data - Best Business Practices for Product Catalog Data Structures - Part 1

October 29, 2008 09:42 by NielsenData

This is the first installment in a series that blends website architecture, data structures, and SEO marketing into a collaborative design pattern.

Designing a product catalog is one of those "better get it right" projects that any e-commerce firm faces.  When you discuss lifespans of projects, this one has the longest lifespan of them all.  Since I've been through this a couple of times, I thought I would share my thoughts and designs as I delve into yet another one.

There are a lot of political and technical pressures put on a product catalog from many departments within an organization including IT, Marketing, Executive, Operations, and particularly the "Industry Expert" within any company.  It is important to not only recognize them, but to appreciate them.  At the end of the day, almost everyone is "right" in their desires to have the catalog data serve them in a certain way.  As you put yourself in their shoes by doing a proper discovery before you start designing, you should try to understand not only what they want, but why they want it.

Atomic Data

Your marketing team will call this "flexible product information", your IT team may call this "dynamic product data", but at the end of the day, it's product data that is smashed into all of its discrete component pieces.

This is one of the first pressures that will be placed on you and you need to be prepared to deal with it properly.  It is important to understand that there is a competing struggle in any database design... Flexible vs. Fast.  If you think of a product as a construction made from legos, then the properties of those products are the individual lego pieces.  The concept of "atomicity" means that you can assemble your lego construction with Red, Blue and Green legos to make a space ship... and then you can rearrange those same Red, Blue and Green legos and build a house.

Now you've all seen the non-atomic way of building a product.  It's a row in a product table and it tends to look like this:

 

You are limited however when you decide to stock a product that has a "Sub Sub Type", or a product that only has one color, or a product that has two vendor brands on it.

You also have a design flaw where you are "numbering instances" of properties.  In this case "Color1" and "Color2" are going to cause problems for you when you want to search by "Color".

There is also a failure to properly "atomize" the data with things like "SubDept" being equal to "Ladies Apparel".

Let's compare this model to one that is fully "fourth normal" or highly "atomic".

 

Let's analyze this model.  The product is statically registered in a much abbreviated product table.  It serves now primarily as a hook that you can hang things from.  We've decided to establish all of our atomic types as "Type", "Gender", "Vendor", "Brand", and "Color".  You can see how this can be reused.  For the "Live Strong Velocity Ladies Sport Top" it makes sense that Color (to this product) "means" White and Yellow... but to other products the same property of "Color" could "mean" other colors.

You can also see the intrinsic hierarchy here that establishes "Apparel" as a "top category" over "Top" and likewise, "Top" as a parent category over "Tank Top".  This enables you to still utilize hierarchies in your product data representations while granting you also the ability to search ad-hoc through your product data in a non hierarchical manner by using the raw properties.
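To make the idea concrete, here is a minimal Python sketch of a product reduced to a "hook" with properties hung from it.  It is only an illustration under my own assumptions: the class names `Property` and `Product`, the helper functions, and the Gender and Brand values are hypothetical and are not taken from the diagram.  The Type hierarchy (Apparel over Top over Tank Top) and the White and Yellow colors follow the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Property:
    kind: str                             # e.g. "Type", "Gender", "Brand", "Color"
    value: str                            # e.g. "Tank Top", "Ladies", "White"
    parent: Optional["Property"] = None   # hierarchy link, e.g. Tank Top -> Top


@dataclass
class Product:
    name: str
    properties: list = field(default_factory=list)


def lineage(prop):
    """Return the property followed by each of its ancestors."""
    chain = []
    while prop is not None:
        chain.append(prop)
        prop = prop.parent
    return chain


def has_property(product, kind, value):
    """True if the product is mapped to the property directly or through a parent."""
    return any(
        p.kind == kind and p.value == value
        for mapped in product.properties
        for p in lineage(mapped)
    )


# Sample values; the Gender and Brand entries are assumptions for illustration.
apparel = Property("Type", "Apparel")
top = Property("Type", "Top", parent=apparel)
tank_top = Property("Type", "Tank Top", parent=top)

product = Product(
    "Live Strong Velocity Ladies Sport Top",
    [
        tank_top,
        Property("Gender", "Ladies"),
        Property("Brand", "Nike"),
        Property("Color", "White"),
        Property("Color", "Yellow"),
    ],
)
```

With this shape, a search by "Color" is a single test against the property list (no "Color1" and "Color2" columns to check), and a search for the top category "Apparel" still finds the product because the Tank Top property carries its parents with it.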

 I have taken an apparel data model and created a good sample of how the property to product mappings for a decent catalog could be structured:

 

This model describes the relationship between products and properties but also illustrates some of the intrinsic relationships between the properties themselves.  For example, if you mapped a City to a product, you could "infer" what State and Country relationship existed by recursing through the Property-to-Property relationships.
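The recursion described here can be sketched in a few lines of Python.  The `PARENT` table and the sample City, State and Country values are hypothetical stand-ins for the Property-to-Property relationships, chosen only to show the walk.

```python
# Child -> parent links between properties (hypothetical sample values).
PARENT = {
    ("City", "Austin"): ("State", "Texas"),
    ("State", "Texas"): ("Country", "United States"),
}


def infer(prop):
    """Walk up the property-to-property links, collecting each ancestor in order."""
    parent = PARENT.get(prop)
    if parent is None:
        return []
    return [parent] + infer(parent)


# A product mapped only to the City still "knows" its State and Country.
inferred = infer(("City", "Austin"))
```

Only the City needs to be mapped to the product; the State and Country come along for free because they are stored once, on the properties themselves.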

So... which data model is right?  The answer could likely be ... Both!  It really depends on your requirements which we will discuss in Part 2 - Best Business Practices for Product Catalog Data Structures - Speed versus Flexibility.

  


Edge Caching Versus Dynamic Data - Best Practices for Product Catalog Data Structures - Part 2

October 29, 2008 09:34 by NielsenData

This is the second installment in a series that blends website architecture, data structures, and SEO marketing into a collaborative design pattern continuing from Part 1 - Best Business Practices for Product Catalog Data Structures - Atomic Data 

We've discussed some ways you can create highly discrete or "atomic" data for a product in the first article.  This article will delve into how to evaluate the choices involved in speed versus flexibility.

Any database administrator who works on a high-volume production website will start to quiver uncontrollably at this point, because there are severe performance implications in accessing this type of data, scattered across several tables, in a production environment.  Pass him a mug of decaf and let's walk together through how we can tackle the thorny issue of speed related to product catalog data.

We can start with our sample product that we have now mapped into its discrete elements.

 

This data is fairly granular (or atomic) and is highly reusable within its domain ("Color" categorically means a similar thing to every product that is bound to it).  There are many considerations when it comes to allowing Speed to dictate your design, but I'll list some of the top ones:

  • Static Edge Presentation vs. Dynamic Source Presentation
  • Precomputation or Data Summarization
  • Staged Caching or Static Publishing

Static Edge Presentation 

Static Edge Presentation refers to the concept that data that is requested through web pages goes through many stages.  One model that many people are familiar with is the following:

 

Generally when the first hit is generated for a distinct URL, such as http://www.domainname.com/?ID=5, the Data Server generates the data needed for the page, the Origin Web Server composes the data into a functional web page, and then the Edge Cache Server distributes that origin page into its "cache" where the unique page sits in "static" for all subsequent hits.  If the page is requested from hundreds of Client PCs after that, only the Edge Cache Server responds to the request (until its cache expires).  If a single Client PC hits refresh over and over again, depending on the Client PC settings, the page is instead served from the Client PC's Browser Cache, which is a local equivalent of server-based edge caching.  This is generally one of the more advanced methods of serving high volume pages in a fast manner (and in a way that the database is impacted the least).  This is the preferred shield which allows your data structures to be a bit more complex (read slow), because at the price of the initial render, the cost per page load is mitigated by the Edge Caching.

Take a page that requires 8 seconds to load.  This is generally considered "too heavy" of a page to be used in production environments.  However, this is only the Origin Page Render cost, meaning it only "costs" this much time for the very first load of that unique page.  If all subsequent hits take only 0.5 seconds from the Edge Cache, then averaged over the number of hits, the page load time quickly approaches 0.5 seconds overall.
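The averaging is easy to check with a small Python function.  This is a simplified sketch: it assumes exactly one origin render followed by cached hits, and it ignores cache expiry.  The function name and parameter names are my own.

```python
def average_load_seconds(hits, origin_cost=8.0, cached_cost=0.5):
    """Average load time when the first hit pays the origin cost and the rest are cached."""
    if hits < 1:
        raise ValueError("hits must be at least 1")
    return (origin_cost + cached_cost * (hits - 1)) / hits


# The average falls toward the cached cost as the number of hits grows.
for hits in (1, 10, 100, 10_000):
    print(hits, round(average_load_seconds(hits), 4))
```

After 100 hits the average is already down to roughly 0.58 seconds, and it keeps closing in on 0.5 seconds from there.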

Another model is the Dynamic Rendered Page which is far more common to most web developers and online businesses:

 

This model demonstrates the direct nature of the requests from the Client PC, straight to the Origin Web Server (which gets its data from the Data Server).  In this model, there is generally a one-to-one relationship between the "hit" and the "data request", so the load on the database server is relatively high.  There are tricks you can use to ameliorate this, including Origin Server Caching, SQL Dependency Caching, and other methods, but most implementations use this form of dynamic page delivery.  In this case, data structures that cause delays can severely impact the performance of the application.

 Take a page now, which due to its flatter data model, only costs a 3 second load time.  Because the Edge Cache has been removed from the architecture, your average page load time is going to remain 3 seconds (the page construction happens over and over again for each hit).  While you gain some flexibility by having constantly changing data available on the page, you pay in the overall load on your servers (up to six times more costly in time than an edge cached solution), and you also are forced into a far less flexible data model to compensate for the speed requirements of live rendered pages.

Precomputation

The concept of pre-computation is based on a similar concept as caching.  This means that pretty much anything your database is going to need to "think" about, can in many cases be "pre-thunk."  The art of pre-thinking things before they are needed involves storing what's been thought out and saving it somewhere.  You also have to factor in the speed of retrieving things... some methods of storage are faster than others.

The diagram below (Self Healing Data Retrieval) shows the "layers" that a data request goes through before a page can be rendered.  It's pretty clear that the fastest way to get data to the customer is when the customer asks for a webpage that has been "pre-thunk" already and is waiting in cache at the Edge Cache (Akamai for example).  Here's where the magic happens.  If the page is not available in cache, the Edge cache forwards the request to the Webserver.  The Webserver then can not only generate the page, but it "heals" the Edge Cache by delivering the new page so any subsequent hits to the same page are now "healed" and available on the Edge Cache again.

 

This type of failover I described above cascades all the way up to the top.  In the examples above, if the Edge Cache fails, the Webserver picks up the slack.  If the Webserver fails, then the Method Farm system checks to see if it has an XML representation of the data in memory (extremely fast).  If the Method Farm doesn't have it in memory, then the Edge Net Storage picks up the slack.  If the Edge Net Storage doesn't have the data, then the Method Farm checks to see if it has it saved in a file on the hard drive (pretty fast).  If the Method Farm doesn't have it written to disk, then the SQL Server attempts to pull a static, pre-generated copy from a static table.  If the static table doesn't have the data, then the SQL server regenerates the data.  In general the failover escalation follows this model:

  1. Edge Cache Static Copy
  2. Webserver XML
  3. Method Farm Memory XML
  4. Edge Net Storage XML
  5. Method Farm XML from file
  6. SQL Server Static Record
  7. SQL Server Dynamic Generation from data

In any of these cases, each step is designed to "repair" the previous caller that failed.  This ensures that over time, the vast majority of requests are serviced by the Edge Cache Server, and the system approaches 100% availability.
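The escalation and "repair" behavior can be sketched generically in Python.  This is not the actual system described above, only a toy model: each layer is reduced to a dictionary, the `Tier` class and `fetch` function are hypothetical names, and only three of the seven layers are shown.

```python
class Tier:
    """One layer in the lookup chain, reduced to a name and a dictionary."""

    def __init__(self, name):
        self.name = name
        self.store = {}


def fetch(key, tiers, generate):
    """Try each tier in order; whatever answers repairs every tier that missed."""
    missed = []
    for tier in tiers:
        if key in tier.store:
            value = tier.store[key]
            break
        missed.append(tier)
    else:
        # No tier had the data, so fall back to dynamic generation.
        value = generate(key)
    for tier in missed:
        tier.store[key] = value   # "heal" the callers that failed
    return value


tiers = [Tier("Edge Cache"), Tier("Method Farm memory"), Tier("SQL static record")]
generated = []   # records how often the expensive generation step runs


def generate(key):
    generated.append(key)
    return f"<page for {key}>"


fetch("?ID=5", tiers, generate)   # cold start: generated once, every tier filled
tiers[0].store.clear()            # simulate the Edge Cache expiring
fetch("?ID=5", tiers, generate)   # answered by the next tier, Edge Cache repaired
```

Note that the second request never reaches the generation step: the next layer down answers it and puts the page back on the Edge Cache.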

Static Publishing

The last method of high volume, high speed retrieval of web pages that can help reduce load on database systems is the Static Publishing technique.  This means that without waiting for a user to request a page, the system is designed to "spit out" every single possible page and page combination that could possibly be hit, and this entire pile of page data is dumped onto an edge cache somewhere.  There is certainly some value to this, particularly for legacy media archives and other non-dynamic, non-live page data, but its use is extremely limited in the e-Commerce arena.
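In its simplest form, static publishing is just a loop that renders every known page ahead of time and writes it somewhere a cache can serve it from.  The Python sketch below assumes a hypothetical `render` function and page data, and writes to a temporary directory in place of a real edge cache.

```python
import tempfile
from pathlib import Path


def publish_all(pages, render, out_dir):
    """Render every known page up front and write each one to a static file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for page_id, data in pages.items():
        path = out / f"page-{page_id}.html"
        path.write_text(render(data), encoding="utf-8")
        written.append(path)
    return written


def render(data):
    # Stand-in for the real page composition step.
    return f"<html><body><h1>{data['title']}</h1></body></html>"


pages = {5: {"title": "Sample page 5"}, 6: {"title": "Sample page 6"}}

with tempfile.TemporaryDirectory() as tmp:
    written = publish_all(pages, render, tmp)
    names = sorted(p.name for p in written)
    first_page = written[0].read_text(encoding="utf-8")
```

The weakness for e-Commerce shows up in the loop itself: every page and every combination has to be known in advance, and any change to the data means publishing again.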

This highlights to some degree the ways in which network and publishing architecture can drive decisions of data structures in general.  If you choose a more normalized method of data structure, then you need to compensate on the performance side with effective edge caching.  If you choose a more dynamic method of page delivery, then you need to look more toward a flatter, more static form of data model that can deliver the performance that you need.  Many database administrators will tell you that the atomic data model listed above (Sample Product to Property Association) may be too normalized for high volume use, but if the data being accessed is used to serve up pages for an edge cache architecture, the negative is eliminated.

It is important to factor in all of the requirements of your web project before making final data architecture decisions, but it is important to note that deficiencies in one decision (choosing a more normalized data structure) can easily be offset in other ways (choosing edge caching over dynamic page construction).  This may give you more freedom as you make your data structure and architecture choices.

Now that you have evaluated your choices of data models and a highly normalized method is a good architectural choice for your situation, it's prudent to examine the benefits of what the data model will enable you to do.  We will examine some of these benefits in Part 3 - Best Business Practices for Product Catalog Data Structures - Customer Paths.

  


Customer Paths - Best Business Practices for Product Catalog Data Structures - Part 3

October 29, 2008 09:31 by NielsenData

This is the third installment in a series that blends website architecture, data structures, and SEO marketing into a collaborative design pattern continuing from Part 2 - Best Business Practices for Product Catalog Data Structures - Speed vs Flexibility 

Many e-Commerce projects begin with an existing brick and mortar store that has decided to go online.  This means that certain data models and business processes can be inherited from the legacy business processes of a non-online environment. 

If you were going to open a physical, brick and mortar store, you would generally design the store based on "Customer Paths", meaning you would examine the vector that a customer would take upon entering your store so you could direct them along the shortest path (in certain cases) to where they were trying to go to find the product that they wanted.  Many websites are designed along a similar path but the application of brick and mortar strategies to websites may not be the most effective.

Take for example the concept that an apparel store is designed along the Customer Path strategy of Departments, Aisles and Shelves.  An apparel store would generally have a Ladies department, with a Shirts Aisle and a Tank Top Shelf.  It would make sense from a Customer Path perspective to have (female) customers enter, segment them by Gender as they walk to the Ladies department, further segment them by Type as they walk to the Shirts Aisle, and further segment them by Type as they scan the Tank Tops Shelf.

This seems to work in practice, but only as long as a single store layout has to serve every customer.  Take a customer now that is female but instead wants the Nike Shirts section.  Your demographic segmentation Customer Path does not cater to them properly and so the Customer is forced to scan through all shelves that have Shirts in order to find the Shirts that match the Nike Brand.  You can see how relying on a fixed hierarchy limits your store planogram and structure in a very singular manner.  To experiment with alternate Customer Paths, you would be forced to do a hard store reset, or you could experiment with alternate locations... perhaps a Nike Store which would provide a Brand-based alternative for the Brand-conscious customer.

Imagine now a website where instead of a fixed store with a rigid, hierarchical structure of Departments, Aisles and Shelves, you had a completely dynamic store that could be rebuilt in an instant and individually for each customer that entered for their own, private shopping experience.  Imagine also, those fixed Aisles and Shelves full of product, which instead of sitting in fixed placements, when a Customer entered the store the entire inventory was tossed into the air, only to fall back in the precise order that the Customer wanted to see them in upon entering.  This is no fantasy in an online e-Commerce website where this type of flexibility is possible.

Let's take a look at the Customer Path options open to an e-Commerce Apparel customer:

 

If you recall the Product to Property Mapping diagram shown in Part 2 - Best Business Practices for Product Catalog Data Structures - Speed vs Flexibility, you will see some of the same Property mappings in the above diagram.  These help to illustrate the product being mapped within the data model along the Customer Preference Paths instead of a fixed hierarchical model that a traditional brick and mortar store operator might follow.

For example, a customer that may be more interested in Tour de France could be immediately segmented in a store with inventory sorted by the Event Property first.  Then, if the customer was interested in the Brand Property next, the inventory would be tailored to suit by showing Nike merchandise.  Finally as the customer settled on a Tour Property related Product with UCI Pro Tour branding, the final product match is easily found because the inventory re-sorted itself to match the preconceived desires of the newly arrived customer.

Similarly, a customer that was more interested, at the time, in Lance Armstrong and then Tank Tops and then a color selection of White, could follow the Customer Path of Player / Type / Color.
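Here is a rough Python sketch of how the same inventory can be "re-sorted" along whatever path the customer chooses.  The three catalog entries and the `follow_path` function are hypothetical; the Property names and values (Player, Type, Color, Event, Brand) follow the examples above.

```python
# Hypothetical products, each mapped to a set of values per Property.
CATALOG = [
    {"name": "White tank top", "Player": {"Lance Armstrong"},
     "Type": {"Tank Top"}, "Color": {"White", "Yellow"}, "Brand": {"Nike"}},
    {"name": "Black jersey", "Player": {"Lance Armstrong"},
     "Type": {"Jersey"}, "Color": {"Black"}, "Brand": {"Nike"},
     "Event": {"Tour de France"}},
    {"name": "Blue tank top", "Type": {"Tank Top"}, "Color": {"Blue"},
     "Event": {"Tour de France"}},
]


def follow_path(catalog, path):
    """Narrow the catalog one hop at a time along a Customer Path."""
    remaining = list(catalog)
    for kind, value in path:
        remaining = [p for p in remaining if value in p.get(kind, ())]
    return remaining


# Player / Type / Color
by_player = follow_path(
    CATALOG,
    [("Player", "Lance Armstrong"), ("Type", "Tank Top"), ("Color", "White")],
)

# Event / Brand: a different path through the very same inventory
by_event = follow_path(CATALOG, [("Event", "Tour de France"), ("Brand", "Nike")])
```

Nothing about the data changes between the two paths; only the order in which the customer applies the Properties does.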

You can see how the model continues.  Take some time to evaluate your own design process when you created your categorization model for your e-Commerce storefront.  Think about the process you went through as you decided on the model and see if you were trying to adapt a brick and mortar model to one that could have been conceived with an online presence in mind from the start.  If so, this may help guide you along a fresh look at the construction of a new categorization schema for your online e-Commerce catalog.

The series continues in Part 4 - Best Business Practices for Product Catalog Data Structures - SEO Path Aliasing



SEO Path Aliasing - Best Business Practices for Product Catalog Data Structures - Part 4

October 29, 2008 09:28 by NielsenData

This is the fourth installment in a series that blends website architecture, data structures, and SEO marketing into a collaborative design pattern continuing from Part 3 - Best Business Practices for Product Catalog Data Structures - Customer Paths.

It may seem counterintuitive to discuss search engine optimization (SEO) techniques in the midst of a conversation about data structures, architecture diagrams and in-store plan-o-grams, but it can directly relate to your choice of data models.  As we discussed in the previous article, it is important to structure your website to conform with the needs of entering customers in a way that segments them properly so they find the things that they were searching for.  Part of this is anticipating what a customer is going to want before they enter your store. 

When dealing with search engines, there are two customers to contend with... the "Natural" search engine... and the "Paid" search engine.  These two customers are very important to understand and to distinguish and need to be treated with a deference and distinction from the "real" customers that frequent your online store.  The complexity arises to some degree because these two "customers" happen to be "ghost shoppers".  You never know when they are going to arrive and they generally float through your store much like a customer would, but they are searching for every product on every shelf in every aisle and in every department... all at the same time.  The complications continue because you want to manage what the ghost shoppers can and cannot see so they don't memorize portions of the store that you don't want reported on the search engines.  This may come across as elemental theory to an SEO expert, but in the context of blending SEO concepts, architecture and data structure modeling, it illustrates one aspect of the equation.

Imagine now that you are a search engine, whose job is to find, identify and classify billions of e-commerce pages throughout the Internet with the primary objective of finding pages that are considered "relevant."  I quote the term "relevant" because what that precisely means changes with the breeze and the whim of arcane departments of voodoo at the various search engine optimization firms.  With that said, you want to look at a natural search engine as a stream of water pouring into your website.  This stream is going to remember whatever it touches, so you want to ensure that it finds the things that you want it to see.  You also need to consider the diffusion of the stream of water as well.  Don't let the natural search engine stumble across pages like "Privacy Policy" or "Terms & Conditions" as that won't deliver any tangible benefit for you.  In similar fashion, on your landing pages you should try to structure your site so the links that are the most compelling draws for the majority of natural searching customers should be setup to receive the largest stream of natural search "attention." 

You also need to anticipate every possible combination of keywords that would be used to "land" on any given destination.  Let's take a look at the SEO Path Aliasing diagram to illustrate that:

 

We have already covered Customer Paths but sometimes the proper "path name" doesn't match an actual English phrase.  This means that the combinations of words that make sense for categorizing a mix of products may not make linear sense for a keyword search.  Our diagram above illustrates this with the green path of "Ladies / Nike".  There may not be many customers that would enter that phrase in a search, but it may be a logical progression as they navigate through a website.  This is where Aliased Paths come in.  In our example, the Aliased Path for "Ladies / Nike" could be "Ladies Nike Apparel"... sure this one is a bit of a stretch...  I'm not sure how many actually type in the word "apparel" but you'll need to work with me on this one.

You will note that this path is identified as "overridden".  In smaller e-Commerce websites, it may be a simple matter to manually go through each Customer Path and identify the possible Aliases but in far larger catalogs this quickly becomes a daunting task.  It doesn't mean that overridden Path Aliases aren't an important part of configuring your catalog categorization scheme, but you can, for the most part, rely on the auto-generated Path Aliases for many of the Customer Paths in your catalog.  Take the path "UCI Pro Tour / Tank Tops" which easily converts to an English text keyword search of "UCI Pro Tour Tank Tops". 
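The split between auto-generated and overridden aliases comes down to a lookup with a default.  In this Python sketch the `OVERRIDES` table and the `path_alias` function are hypothetical names; the two paths are the ones used in the examples above.

```python
# Manually maintained overrides, keyed by the Customer Path they replace.
OVERRIDES = {
    ("Ladies", "Nike"): "Ladies Nike Apparel",
}


def path_alias(path, overrides=OVERRIDES):
    """Return the overridden alias if one exists, else join the hops into a phrase."""
    key = tuple(path)
    return overrides.get(key, " ".join(key))


auto_alias = path_alias(["UCI Pro Tour", "Tank Tops"])   # auto-generated
manual_alias = path_alias(["Ladies", "Nike"])            # overridden
```

Only the paths that read badly as a search phrase need an entry in the override table; everything else falls through to the generated alias.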

Note also our attempt to focus the "stream" of the natural search flow throughout the various Customer Paths.  Many search engines honor a rel="nofollow" setting on hyperlinks (referred to here as NOFOLLOW).  This mechanism gives you some measure of control over which links you allow the natural search "probing" to find.  You will note how the various Customer Paths are identified as NOFOLLOW for those paths that we want the search engines to pass on as they traipse through our pages.  This poses another logistical issue in a large-scale e-Commerce website which we will address in the next segment, Part 5 - Best Business Practices for Product Catalog Data Structures - SEO Weighted Auto Mapping.



SEO Weighted Auto Mapping - Best Business Practices for Product Catalog Data Structures - Part 5

October 29, 2008 09:25 by NielsenData

This is the fifth installment in a series that blends website architecture, data structures, and Search Engine Optimization (SEO) marketing into a collaborative design pattern continuing from Part 4 - Best Business Practices for Product Catalog Data Structures - SEO Path Aliasing.

We have discussed custom-tailoring a website's NOFOLLOW and Path Aliases to tightly tune the "stream" of natural search flow throughout your website.  By tuning what the search engines "see" you will be able to help your search engine rankings climb for the pages that you most care about.  In large scale e-Commerce platforms, it becomes an onerous task to keep up with all of these customizations.  Here is another case where your choice of an atomic data model will serve to automate some of these functions.

Let's examine the following model of SEO Weighted Auto-Mapping:

 

Here is a scenario where we assign "weights" to the various Property nodes that can be mapped to Products.  Once the weights are assigned, we can develop custom business rules that will help us "scale up" or "scale down" our Weighted Path's sensitivities to the search engines (through the use of NOFOLLOW tags).

We can roll back to the original case of a standard brick and mortar store that was the basis of our e-Business (for example).  In a traditional brick and mortar business, let's say that we determined that in general, segmenting our customers by Gender tended to be the most common and most popular means of diverting our customer traffic.  This could give us a clue on our e-Commerce website on what weight to assign to the Gender Property.  Since this Property holds primacy over the rest of the Properties in our categorization scheme, we could assign it with a high "weight" value.

Take our example above where we have decided that the Player Property is the highest ranking "Path" starting point in our categorization schema.  This is essentially because, in the cycling apparel business, Lance Armstrong (the keyword phrase) drives a significant portion of our prime traffic.  It also tends to be a highly competitive term that we would like a high search ranking for.  Additionally, it is a phrase that we would like to channel a lot of natural search traffic through, even to the exclusion of other lower performing phrases that have a significantly lower revenue opportunity.  For this, we assign the Player Property (regardless of the specific Player identified) a weight of 10.  This means that a customer that "lands" on the Lance Armstrong Player landing page who directly orders a product is defining the primary Customer Path that we are interested in promoting and that path gains a score of 10 / 1 (hop) which averages out to a 10 (no surprise).  Any links to this particular URL do NOT receive the NOFOLLOW parameter and the natural search engines will stream most of their energy through links like this.

We also have the option of defining our business logic for what rules we want to apply.  One example is how we set the threshold for NOFOLLOW parameter placements.  We have decided in the above example to set NOFOLLOW parameters on any Customer Paths that rank less than an average of 10.  Effectively we are deciding that we want the full "stream" of the natural search engines to flow through these highly weighted paths, which will tend to be very direct links through Products mapped to the Player Property.

We can layer in other business rules as well.  One business rule that we are using in the above model is the method of computing a multi-step Customer Path "weight".  In the example above, we simply decided to add the cumulative weights of all "hops" in each Customer Path and divide by the number of hops.  Take the Customer Path of "Tank Tops / Ladies / Cycling / Lance Armstrong".  Each "hop" as the customer steps through that path adds to our total and because there are four hops along the path, we divide the total (34) by the hops (4) and come up with an overall weight of 8.5.  This business rule may be subject to some review.  It seems that an alternative formula might be to reduce each hop's weight by the "distance" from the initial starting point.  This would then be (8 + (7-1) + (9-2) + (10-3)) / 4 = 28 / 4 = 7.  However you decide to "compute" the average weight of any given Customer Path, it should make sense for your business while delivering some automation where possible for the NOFOLLOW mappings within your categorization scheme.
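Both formulas, along with the NOFOLLOW threshold of 10 described earlier, fit in a short Python sketch.  The function names and the link URLs are hypothetical, and the per-hop weights (8, 7, 9, 10) are read off the arithmetic above in path order.

```python
from html import escape


def average_weight(weights):
    """Sum the weight of every hop and divide by the number of hops."""
    return sum(weights) / len(weights)


def distance_discounted_weight(weights):
    """Alternative rule: reduce each hop's weight by its distance from the start."""
    return sum(w - hop for hop, w in enumerate(weights)) / len(weights)


def render_link(url, text, score, threshold=10):
    """Emit a hyperlink, adding rel="nofollow" when the path scores under the threshold."""
    rel = ' rel="nofollow"' if score < threshold else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


# "Tank Tops / Ladies / Cycling / Lance Armstrong"
path_weights = [8, 7, 9, 10]

simple = average_weight(path_weights)
discounted = distance_discounted_weight(path_weights)

# Hypothetical URLs: the direct Player landing page keeps full search "flow",
# while the four-hop path falls under the threshold and is marked nofollow.
direct_link = render_link("/player/lance-armstrong", "Lance Armstrong",
                          average_weight([10]))
deep_link = render_link("/tank-tops/ladies/cycling/lance-armstrong",
                        "Lance Armstrong Tank Tops", simple)
```

Once the weights live on the Properties, changing the business rule means swapping one function rather than re-tagging links by hand.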

This demonstrates yet another possible use of blending the choice of data structure with your requirements for SEO initiatives.  We can explore more methods of integrating data models with Search Engine Optimization techniques in Part 6 - Best Business Practices for Product Catalog Data Structures - Search Optimization.



Internal Site Search Optimization - Best Practices for Product Catalog Data Structures - Part 6

October 29, 2008 09:15 by NielsenData

This is the sixth installment in a series that blends website architecture, data structures, and SEO marketing into a collaborative design pattern continuing from Part 5 - Best Business Practices for Product Catalog Data Structures - SEO Weighted Auto Mapping

It turns out that Property mappings for Products can also lend a hand for search term optimization.  While this is helpful with external search engines through Customer Path definitions, it also becomes helpful with internal search tools built into the site itself.

As you are defining your internal search strategy, you have many considerations that need to be planned out.  Some aspects of internal search could include:

  • Related Keywords
  • Misspellings
  • Alternate Names
  • Competitor Keywords
  • Plurals
  • Multilingual
  • Synonyms
  • Legacy Phrases
  • Synonym Phrases

The diagram below demonstrates some uses of Property mappings dovetailed with internal search terms:

You can see in this model how we can implement various internal search strategies that are directly mapped to Properties.  This helps us because the mapping of internal search terms can be done atomically to each Property value which when mapped to Products can create an aggregate library of internal search terms for each product that is mapped.

Take the Customer Path of "Ladies / Nike".  We have decided to map Synonym terms for "Ladies" that include "Womens", "Hers", and "Female".  While the actual value of each synonym should be independently tested (through A/B or multivariate testing), each one of these could be used interchangeably with the term it replaces.  This helps us on the natural search engines that traverse the Customer Path and also contributes to a more effective internal search algorithm.  Now the Customer Path can be addressed through "Ladies Nike" and "Hers Nike" at the same time.

In similar fashion, if a customer was looking for products that were mapped to the White Color Property, even multi-lingual terms could be interchanged such as "Blanco" and "Blanc" which opens up our search results to even more ranges of public and private search engines.

Related keywords enable us to establish corollary alternatives to common terms.  In this case, one of the Nike related keywords could be "Velocity", to which the Property of Nike could be mapped.

Misspellings offer a rich range of additional keywords that can be layered onto a particular Property value, such as customers that type in "Lantz" instead of "Lance" when they are searching for Lance Armstrong apparel.  It's useful to mine the "missed" search logs as you let your internal search tool collect them so you can decide which misspellings to incorporate into your Property value mappings.

Alternate names allow you to link various other phrases that can be used interchangeably.  In this example, the UCI Pro Tour is also referred to as the UCI Tour.

Competitor Keywords layer in the functionality of "borrowing" a bit from the brand equity of the phrases that may be used by your competitors.  If there was a competitor that used a brand name of "Tankz" and you were selling "Tanks" as one of your products, you could affiliate their brand key phrase with your product Property mappings.

Plurals are such an easy keyword combination to miss but they are ever so common.  Because it's highly intuitive that customers will use plurals (or singulars) in everyday use, this should be prioritized as one of the best targets for low hanging fruit.

There are other uses of this Property to Product mapping with alternate keyword value definitions that I haven't even thought of, but I hope that the message is clear that utilizing the Property to Product data mapping architecture can provide a high degree of flexibility and utility.  In general, highly atomic information can be reconstructed at will based on the needs of the customer without preventing you from implementing edge caching as your front end solution to the client.

We continue to explore how to leverage the atomic data model in Part 7 - Best Business Practices for Product Catalog Data Structures - Comparison Shopping Site Syndication.



Comparison Search Engine Feeds - Best Practices for Web Product Catalogs - Part 7

October 29, 2008 09:10 by NielsenData

This is the seventh installment in a series that blends website architecture, data structures, and SEO marketing into a collaborative design pattern continuing from SEO Weighted Auto Mapping - Best Practices for Web Product Catalogs - Part 6.

I initially intended to stop at Part 6 but as I continued to think about the usefulness of highly atomic product catalog data for e-Commerce catalogs, I began to think about how to use that information in more ways.  Dr. Flint McLaughlin commonly teaches a theory he calls the "Marketing Experiments Optimization Methodology":

Marketing Experiments Optimization Methodology   

He obviously can explain it far better than I can, but the general rule is that your first priority is to optimize your product (and its value proposition), followed by the presentation of that product, followed by optimizing the channels through which the product is sold.  I will certainly investigate the first two, but if you focus solely on the "Channels" aspect of this, let's think about how our Product Property data can help us optimize our Marketing Channels.  I'll revisit one of my earlier diagrams that shows the various paths that can be traveled to finally land on a product.  This concept that the customer can "come from anywhere" and is going to want complete flexibility in how they choose to travel through your website is just as valid for other websites that show your products as well.

 

Consider a model where we have completed our website, we have optimized it for the search engines, and we are generating significant traffic.  Jared's First Law of e-Commerce states (yes, I just made that up...) that it is illegal to make money on the Internet.  The United States Treasury is the only institution entitled to "make money"... so our real objective is not to "make" money... but to "divert" money from other places so we get that money.  If you think of your e-Commerce project in that light, it refocuses some of your goals.  Now that we know that our job is to divert money that people would have otherwise spent somewhere else... what are the best tactics to accomplish this?

One of the primary mechanisms for this is called Product Feed Syndication.  Syndication reminds me of Gary Larson penning a Far Side comic strip.  He spent his time creating a quality comic strip, but he didn't call up every newspaper and ask them individually to print it in their papers.  He used a Syndicator or an Agent whose job was to do that for him.  In similar fashion, you've created your entire e-Commerce catalog with rich product information and you've published your own website containing it... but you're not the Wall Street Journal quite yet... are you?  That means that the vast majority of eyeballs that may be interested in your products are cruising through websites other than your own and you need to divert them to you.

Syndication Product Feeds are taps into your database that compose the products, their details, and their categorizations in a format that is consumable by the various feed agents (or syndicators).  These can take several forms which I like to classify as:

  1. Comparison Search Engines
  2. Online Marketplaces
  3. Product Review Sites

This article will primarily address the Comparison Search Engines but some of the principles can apply to the others.

Comparison Search Engines serve an important role in helping you expand your Channel Footprint.  Basically, if you're the mom-and-pop shop on the corner, the Comparison Search Engines are the mall in downtown (some larger and more useful than others) and it's useful to have a presence in both places.

The challenge you have with Comparison Search Engines is that they too require navigation paths.  They may call them "categorizations" or "paths" or "trees" but they map directly to the Paths described in the diagram above (Aliased Path Automapping) just like they do on your own website, with the exception that their paths are going to differ from yours in many cases.  One good example is the path that many Comparison Search Engines use for Sports Apparel.  They start with "Apparel", then continue to "Sports" and then to "Cycling" and so on.  However, they also provide paths that start with "Sports", continue with "Cycling", and then "Apparel" which really contains the same product mix as the one before.  Many feeds have a very flat structure to them, meaning that for each product, you may be able to specify four separate "Paths" that will be submitted with that product to the Comparison Search Engines.

Because our Product Property data is atomic and weighted, we can simply query out the top four weighted paths from our database for each product and layer that into our feeds to the Comparison Search Engines.
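To make the "top four weighted paths" query concrete, here is a small Python sketch using an in-memory SQLite database.  The table `product_path` and its columns (`product_id`, `path`, `weight`) are assumptions made purely for illustration; your own schema from the earlier parts of this series will differ.

```python
import sqlite3


def top_paths(conn, product_id, limit=4):
    """Return the highest-weighted paths for one product, best first."""
    rows = conn.execute(
        "SELECT path FROM product_path WHERE product_id = ? "
        "ORDER BY weight DESC, path ASC LIMIT ?",
        (product_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data: one product with five weighted paths.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_path (product_id INTEGER, path TEXT, weight REAL)"
)
conn.executemany(
    "INSERT INTO product_path VALUES (?, ?, ?)",
    [
        (1, "Apparel > Sports > Cycling", 0.9),
        (1, "Sports > Cycling > Apparel", 0.8),
        (1, "Ladies > Nike", 0.7),
        (1, "Nike > Jerseys", 0.6),
        (1, "White > Jerseys", 0.2),
    ],
)

print(top_paths(conn, 1))
```

The four returned strings can then be written into the four "Path" columns that a flat feed format allows for each product.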

We can also tightly couple our Property mappings to their Paths or Categorizations by dissecting their Paths into their component parts.  If a Comparison Search Engine has a root level of "Apparel", we can match that to our Property Type of "Apparel".  That can then form the basis of a query that pulls only the top four weighted paths for that product that originate with the Apparel Property mapping.  If you really want to get aggressive on each Comparison Search Engine, you could layer in a direct mapping from the Product Properties to each and every path variation provided by each Comparison Search Engine categorization.
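The root-matching idea can be sketched in a few lines of Python.  Here each path is assumed to be a tuple of Property values paired with a weight; the function name `top_paths_for_root` and the sample data are hypothetical.

```python
def top_paths_for_root(weighted_paths, root, limit=4):
    """Keep only paths that start at `root`, then take the best `limit`."""
    matching = [(path, weight) for path, weight in weighted_paths if path[0] == root]
    matching.sort(key=lambda item: item[1], reverse=True)
    return [path for path, _ in matching[:limit]]


# Hypothetical weighted paths for a single product.
paths = [
    (("Apparel", "Sports", "Cycling"), 0.9),
    (("Sports", "Cycling", "Apparel"), 0.8),
    (("Apparel", "Ladies", "Nike"), 0.7),
    (("Ladies", "Nike"), 0.6),
]

print(top_paths_for_root(paths, "Apparel"))
```

If a Comparison Search Engine only accepts categorizations rooted at "Apparel", this filter keeps the feed from submitting paths the engine would reject, while still favoring the highest-weighted ones.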

As I have time, I'll try to revisit this article and distill some more useful information about the various Comparison Search Engines and how to feed each one.

Once you have mastered the Comparison Search Engine syndication feed, we can continue to leverage our Atomic Product Data in Online Marketplaces and Auction Sites - Best Practices for Web Product Catalogs - Part 8.



Best Practices for Internet Marketing ( SEO / SEM ) - Customer Objectives - Overview

October 29, 2008 08:08 by NielsenData

Most companies compete in crowded markets in specific niches.  This document will serve as an analysis of your specific project with respect to best practices that integrate traditional and internet marketing to help you go through your current marketing plan to see which holes can be filled in your own strategy. 

The emphasis of this document is to focus on the website and the web marketing initiatives that drive the business in an effort to allow the business model to begin the transition toward more exponential growth and high profit ratio approaches.  This should be combined with effective tracking and metrics with an initial baseline so we can compare where you are now versus where you are going from month to month and year over year.

There are many ways to accomplish this, so we need to break it down into phases so we are tackling each phase in the proper order (prioritized in order of what should come first).

Product

The first aspect of the marketing analysis should always focus on the product itself.  What it is and what it does for the customer.  It is important to note that customers don’t buy “products”… They are buying what your product will "do for them".  This is an important distinction.  You are not selling a product… you are selling the freedom or the sexiness or the relief or the status that your product provides.  This "benefit" should be the key centrepoint of your marketing message.  Each other marketing message and "touch" should always come around to confirm this one, unique message.

Presentation

The presentation of the product is the second aspect that we examine.  The website should conform to match the target audience and should match the quality and tonality of the message that is being conveyed.  If you are providing happiness to an erstwhile unhappy customer, then dark and gothic coloring may not be the best color theme to convey the joy and happiness that you are providing.  If you are going to provide a corporate, high-tech benefit to a customer, then going pastel or crayola colors won't be sending the proper message.

The content and the product listings should further confirm the message in a standard way, further reinforcing the brand.  If the message is the product will deliver freedom to the customer, then product listings should show real humans enjoying the freedom with that specific product.  Alternate images or views of the product can zoom into the "features" and alternate views of the product, but the initial product view should deliver the same branding message.

Channels

Once the presentation is remedied, we should focus on the channels.  The internet can be considered a “Channel” but there are innumerable tributaries of that one “river”.  The objective of your website is to not only build your own river (by driving traffic to your website), but to also dip your ladle in everyone else’s river so you are diverting that “juice” through your own sales pipeline.  If your website is one tiny channel, there are others such as comparison shopping sites or marketplaces that can also represent your brand and your product to give your merchandise the maximum exposure.

 

Customer Objectives

Rather than thinking about what your website looks like first, we should consider what you want to get from the website as the primary consideration.  These customer objectives are goals that your website has for your various customers and demographics.  Each demographic segment may have a unique set of goals that should be thought out ahead of time.  You may want to generate a sale from a customer while you may want a prospective investor to purchase shares of stock.  You may want to get the email address of a customer while you may want an anonymous visitor to fill out a poll.  These can be summed up as the following:

·         Generate a Sale

·         Gain Critical Insight

·         Build a Relationship

·         Innovate New and Resaleable Solutions

Generate a Sale

While this goal appears to be obvious, it must be examined very specifically.  Most people think that they want sales, so they buy an ad.  Then they stop.  That is hardly a comprehensive strategy but we will focus on the sales generation piece first.

There are several activities that can lead to sales through the website.  These are:

·         Advertising

·         Direct or Return Visits

·         Referrals

Advertising

Generally when you purchase an advertisement, you want the customer to “see”, then to “land”, then to “engage”, then to “convert”.  Many websites may do the “see the ad” piece effectively, but there may not be a clear approach toward making the “landing” very focused or relevant.  Once the customer has landed on a page, there may not be clear instructions on what to do next so they are not feeling “engaged”.  Finally the conversion could be hard to follow or may be more complex than it needs to be to accomplish an objective.  Imagine a website that requires the customer to create an account before they can order a product.  While it seems logical, it intrudes on the critical path of the sales goal of "making a sale".

There are generally two types of "customers" that consume the website content... Robots and Humans.  Many websites that optimize for search engines do a good job of catering to the robots while sacrificing the customer experience.  Other websites are focused on the customer experience while punishing that of the robots that cruise through the underlying code.  Even others simply cater to the whims of the board of directors without taking into account either the customers or the robots that are the primary target audience.

The advertising copy itself should be examined and optimized as well to maintain high relevance to the landing page while confirming the centralized branding and marketing message.

There are several forms of Advertising:

·         Email Advertising

·         Pay per Click Advertising

·         Presentation Advertising

·         Comparison Shopping Advertising

·         Marketplace Advertising

Email Advertising

Email advertising at its simplest should focus on delivering the advertising impression with the goal of getting the email reader to land on a highly relevant web page.  However, there is an opportunity to begin to pre-engage the customer before they land with good copy, personalization, and a tone that matches the brand marketing message.  Many effective email implementations use teaser paragraphs that show the first portion of the copy with a URL link so they can read the rest of the article.  Some email communications even go so far as to allow a customer to "buy now" and actually complete the conversion objective right in the email.

Pay Per Click Advertising

Paid advertising that only costs when the customer interacts with the ad is a highly effective form of paid advertising.  Metrics should be in place to ensure that when the customer clicks on the advertisement, we gather the maximum information possible about the click... where they came from... what they did... what they landed on... etc.  You can think of pay per click as the sponsored ads on the top of a Google search for example.
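As one hedged example of gathering information about a click, the Python sketch below pulls campaign details out of a landing URL.  It assumes your ads are tagged with the `utm_source`, `utm_medium`, `utm_campaign`, and `utm_term` query parameters (a common analytics tagging convention); the function name `describe_click` and the example URLs are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def describe_click(landing_url, referrer=""):
    """Summarize where a paid click came from and where it landed."""
    parsed = urlparse(landing_url)
    params = parse_qs(parsed.query)

    def first(name):
        # parse_qs returns a list per key; take the first value if present.
        return params.get(name, [""])[0]

    return {
        "landing_page": parsed.path,
        "referrer_host": urlparse(referrer).netloc,
        "source": first("utm_source"),
        "medium": first("utm_medium"),
        "campaign": first("utm_campaign"),
        "keyword": first("utm_term"),
    }


click = describe_click(
    "https://www.example.com/ladies/nike"
    "?utm_source=google&utm_medium=cpc&utm_campaign=fall&utm_term=ladies+nike",
    referrer="https://www.google.com/search?q=ladies+nike",
)
print(click)
```

Storing a record like this for every paid click gives you the raw material to compare landing pages, keywords, and campaigns against the conversions they produce.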

Presentation Advertising

Presentation advertising is a higher-risk, less focused form of advertising that is broadcast to more channels than the demographic segment needs for a period of 90 days, after which the various presentation advertising channels are tuned by effectiveness and the advertising is scaled back to only those channels that are delivering results.  You can imagine a presentation ad as the graphical banners you see frequently on websites that advertise products and let you click through.

Comparison Shopping Advertising

This form of advertising consists of inserting your product database into other specialized websites that are designed to show your product (and its prices) compared against other similar, related (or identical) products.  This is a powerful form of advertising for two key reasons.  First, it increases the aggregate exposure of your products on additional channels.  Second, your product data is generally shown on webpages other than your own that have a high PageRank (PR) value, with links to your product detail pages.  This will increase the PR value of the destination pages because of the high-PR pages linking to them.

Marketplace Advertising

The only thing better than selling your product in a store is selling your product in thousands of stores.  This franchising or mass market distribution model is duplicated on the internet in the form of marketplace syndication or advertising where you send your product catalog daily to sites like amazon.com, ebay.com  or other online stores.

Direct or Return Visits

Many websites provide little incentive for a customer to "hang out" or to return again.  This is an ancient relic of brick and mortar stores that had a logistical reason for not providing chairs within their store for fear that the customer might linger.  On the internet, lingering is a desirable activity and should be encouraged.  There is a lot of work that must be done to open up the functionality of websites to make them more inclusive and more of a place that will let customers come in… sit down… meet the people… and talk to us about their life situation…  Once we understand who the customer is and what they want, then we may sell them something, but the customer will feel included and will feel a part of the community and will bookmark the site or remember the URL and will return over and over again.

There are several sources for direct or return visits, including:

·         Direct Traffic (people typing in the direct website address)

·         Default Homepage (people have their default homepage set to the website)

·         RSS/Screensaver (people who have subscribed to updates within the homepage)

·         Favorite (people who have the website as one of their favorites)

Direct Traffic

Customers are encouraged to type in the website address through a variety of mechanisms.  Your company should include the website address on your business cards, billboards, print catalogs, radio advertisements, vinyl wraps on your car, etc.  These forms of branding are vastly more effective if your domain name is native to the language, easily remembered, and simple to spell without complexities like punctuation, multiple words, reliance on plurals, or phonetically challenging words.

Default Homepage

Getting a customer to set your website as a default homepage is a distinct challenge dominated by Google itself, but may make sense in certain circumstances where you exert unprecedented control over the customer's computer (in-store kiosks, employee desktops, company-provided hardware, giveaway CD Roms, customized browser software, etc).

RSS/Screensaver

RSS or Atom syndications are XML feeds that tend to be relatively ugly until consumed by an RSS "appliance" such as the iGoogle.com homepage or the my.Yahoo.com homepage of customer accounts on other systems.  Visualization appliances can be made that will make this much prettier for the customer including screensaver applications, RSS readers, Gadgets/Gizmos/Plugins (depending on the underlying architecture), and Flash wrappers for the RSS provided data.
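For readers who have not seen the raw XML behind a feed, here is a minimal Python sketch that builds an RSS 2.0 document with the standard library.  The function name `build_rss`, the channel details, and the example item are all hypothetical; a production feed would normally carry more elements (publication dates, GUIDs, descriptions per item).

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build a bare-bones RSS 2.0 feed and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description for the channel.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_rss(
    "Example Store Updates",
    "https://www.example.com/",
    "New products and offers",
    [{"title": "New cycling jerseys", "link": "https://www.example.com/jerseys"}],
)
print(feed_xml)
```

The raw output is exactly the "relatively ugly" XML described above, which is why the reader, gadget, or screensaver that consumes it matters so much to the customer.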

Favorites

Every website should have an easy to use "add to my favorites" button or link that makes it simple for the customer to remember the website and come back.  Email reminders and communications tend to reinforce this "favorite" concept through consistent reminders with links back to the website.

Referrals

Many websites rely on referrals to gain higher popularity and rankings.  There are several methods of obtaining referrals including:

·         Natural Search Links

·         Link Exchanges

·         Media Mentions

·         Word of Mouth Referrals

Almost all websites profess to having "natural search" as a primary marketing objective.  This is the most well known form of referral advertising.  Many websites do a good job with these, but all of them can be expanded to more fully capitalize on more forms of referral advertising.  Word of mouth is likely the best referral mechanism but some websites make it very difficult to convert these types of referrals if the primary motivation of a customer is to place a phone call but the phone number is hard to find on the website.  Don't make them dig for it.

The key thing to understand here is that for any of the above methods to work, your website will be measured not by what other sites you link to, but by the importance of the sites that link to you.  An ongoing effort should be launched to get other high ranking sites to link to your website to improve all of these referral advertising methods.

Natural Search Links

This is the most well-known method of referral advertising but the way many companies attempt to tackle it is fraught with misinformation.  In the old days, SEO firms would try to "game" the system and trick the search engines.  This has proven to not only be short-term thinking, but the search engines have started punishing the websites that try to trick them into thinking pages are relevant when they aren't.  Getting blacklisted by them is a hard challenge to overcome once tagged as an abuser of the search engine formulas.  The best approach here is to be honest and up-front by making your individual pages rich with content, highly useful (with lots of relevant links and a wealth of information that contains rich keyword densities and patterns), and extremely relevant to the keywords, regions, locations, cultures, and languages of the target audience.

Link Exchanges

Many companies use a strategy called "link farming" where they pay an outside service to simply spam thousands of other websites to link to your pages, which gives an artificial and highly irrelevant boost to natural search rankings for a short period of time (until the search engine figures out that the links are completely irrelevant to each other).  This brute force method is destructive over the long term, but companies just keep up the spamming and try to stay ahead of the inevitable punishment.  This is what I call "inorganic link farming" because once the spamming stops, all of the links inherently dry up and blow away and it doesn't take on a life of its own.

The best method of approaching this is "organic link farming" where highly relevant links are solicited with the highest quality business partners (including those with a high PR ranking value).  Some easy ones are online yellow page sites that generally have a high PR value and will link to your site.  Others include .edu (educational) websites that may link to your website if you cite their research or provide materials for their research in exchange for a link.

It is helpful to encourage your customers or constituents to put links to your site on their message boards and forums (without link spamming).  This type of organic linking will continue to grow because it was relevant and others will see the relevance and will pass the link on to their own websites (or those of others).

Media Mentions

This borders on the "traditional" but it holds as true now as it did pre-internet.  You should let the media know what you are doing, particularly when you do something that has a viral quality or is outstanding or newsworthy.  It is important to understand that media outlets are "meat grinders".  Every day the media editors wake up and need to grind meat.  If you are throwing pre-written stories into the hopper each day, it's inevitable that your stories will be picked up, or at a minimum will pique the interest of a writer that may synchronize with a story that they are writing (or will prompt them to write a story).  Standardizing this method of media notification should be a standard business practice.

Word of Mouth Referrals

It may seem obvious, but you should be asking your satisfied customers to pass the message along... particularly if you have an incentive that will benefit them.  Sending a $10 coupon to a customer when they give you 10 email addresses of friends that would be interested as well, or other moral, honest approaches that reinforce your brand while spreading the message, could be used with great effect.  Above all the best technique is to do a good job and take good care of the customer... particularly when you make a mistake and quickly remedy it.

Gain Critical Insight

Many companies focus so much on generating sales that they fail to do a post-mortem on client contacts and communications.  Many sites have Google Analytics installed, but leave out conversion tracking.  Some implement conversion tracking without establishing goal funnels.  These are all free services that should be exploited and analyzed monthly so you can tune what is working and pour more resources into the marketing efforts that have positive return on investment (ROI).
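A goal funnel is easier to reason about with numbers in front of you.  The Python sketch below computes the step-to-step conversion rate for the "see", "land", "engage", "convert" sequence described under Advertising.  The counts are invented for illustration, and `funnel_report` is a hypothetical helper, not part of any analytics product.

```python
def funnel_report(step_counts):
    """Return (step, count, rate) where rate is relative to the prior step."""
    report = []
    previous = None
    for name, count in step_counts:
        if previous is None:
            rate = 1.0  # the first step is the baseline
        elif previous == 0:
            rate = 0.0  # avoid dividing by zero when a step had no visitors
        else:
            rate = count / previous
        report.append((name, count, rate))
        previous = count
    return report


# Hypothetical monthly counts for one advertising campaign.
steps = [("see", 10000), ("land", 500), ("engage", 200), ("convert", 20)]

for name, count, rate in funnel_report(steps):
    print(f"{name:8} {count:6} {rate:.1%}")
```

Reviewing a table like this each month shows which step loses the most customers, which is where the next round of tuning should go.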

There are several methods of gaining the insight needed from the website, including:

·         Polls / Questionnaires

·         Feedback

·         Trend Analysis

Polls

Polls provide a sort of breadcrumbed guidance that can be critical for a business without being too onerous to the client.  These little snippets of information can be aggregated and summarized into very effective cross-sections that can provide necessary guidance on ongoing website efforts.  Aggregate analysis is only effective with enough traffic to produce a meaningful number of responses, which many websites lack.

There are several categories of Polls that could be utilized on the website (or to the benefit of the website database) including:

·         Profile Polls (polls that occur after a user has logged in)

·         Anonymous Polls (polls where you cannot correlate user responses)

·         Syndicated Poll (polls on other websites that feed into a central database)

·         Social Networks (applications within social networking spaces that deliver poll-style data)

Profile Polls

Profile polls are engineered to slowly collect information about a customer that wouldn't be appropriate to collect in the account creation stage.  You may let a customer register their email address and instead of asking them a million questions, you simply remember their email address (through cookies or microdots) and as they fill out other polls, you will be able to identify new information about them (which is subsequently added to their profile).  For example, you may let a newsletter subscriber register their email address but upon filling out a poll, you discover that the answer they gave was a typically female response to the question.  This will give a hint to their profile that the respondent is female and you can start to tune your response to that account from a female perspective (until you find out differently).

Anonymous Polls

While on the surface it may not make sense that anonymous polls would deliver rich information, they still allow you to find out what a "herd" is thinking... even if you don't have specifics.  Herd mentalities are a challenge to quantify but it's important to do numerically corrected sampling to find out what conclusions you can derive from anonymous polls.  For example, if you have ten thousand poll responses per day and only 10% of them are website complaints/issues, if you suddenly see a spike in the ratio of these anonymous complaints, it may indicate a database or edge caching failure of your underlying infrastructure that needs to be addressed.
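One simple way to decide whether a jump in the complaint ratio is a real spike or just noise is to compare it against the baseline while accounting for the number of responses.  The Python sketch below uses a basic z-score on a proportion; the 10% baseline comes from the example above, while the threshold of 3 and the function names are assumptions you would tune for your own traffic.

```python
import math


def complaint_z_score(total_responses, complaints, baseline_ratio=0.10):
    """How many standard errors today's complaint ratio sits above baseline."""
    if total_responses == 0:
        return 0.0
    observed = complaints / total_responses
    standard_error = math.sqrt(
        baseline_ratio * (1 - baseline_ratio) / total_responses
    )
    return (observed - baseline_ratio) / standard_error


def is_spike(total_responses, complaints, baseline_ratio=0.10, threshold=3.0):
    """Flag the day when the ratio is far above what sampling noise explains."""
    return complaint_z_score(total_responses, complaints, baseline_ratio) > threshold


print(is_spike(10000, 1000))  # ratio equals the baseline
print(is_spike(10000, 1100))  # 11% complaints on ten thousand responses
print(is_spike(100, 11))      # 11% complaints on only one hundred responses
```

The same 11% ratio is treated differently at ten thousand responses than at one hundred, which is the point of correcting for sample size before raising an infrastructure alarm.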

Syndicated Poll

This is a unique method of extending the reach of your polling engine to other websites.  If there are networks of sites that may benefit from a poll that you are doing, then you can syndicate the web appliance that makes the poll appear and render a response to other websites.  By getting other websites to "syndicate" your polling engine, you remain the centralized repository of information but you are now absorbing the bandwidth and reach of other websites... effectively tapping into their customer base as well as yours.

Social Networks

This relatively new form of "polling" is now prevalent on sites like MySpace, Facebook, and others.  Even though they take the form of games ("JohnnyB just slapped you with a big red herring!"), they serve as an oblique form of polling from which information can be gleaned.  If a social networking participant uses your Facebook "game" to send a taunt from the Steelers to a Cowboys fan, then you have a demographic hint that the sender likes the Steelers and the recipient likes the Cowboys.  You may also be able to derive additional hints such as where they went to college, where they might live, what language they might speak, whether they are male/female, etc.

Feedback

A critical component to any business strategy and particularly websites, the feedback opportunities serve a dual purpose of gleaning non-deterministic and subjective data from customers while letting the customer feel better about having said something (whether in critique or to compliment).

There are several venues for feedback that could be employed:

·         Post Sale Survey

·         Response Forms

·         Email Responses

·         Personal Call / Visit Response

·         Anonymous Survey

Lumped in with these options are events such as seminars, trade shows, and other public events that effectively provide the forum for the above activities to take place.

Post Sale Survey

Closing a deal is hardly the end of the customer relationship.  The best way to get a customer to remember what a great experience they had with their order is to call it to their attention and ask them for feedback on how it went.  It's important not to nag them for this information; instead, lace the request with a coupon off their next order, or a freebie that they can get in addition to their order if they participate.  It works even better if the freebie is related to the product that they just ordered.  "Thank you for ordering the Blackberry Curve cellphone from our online store.  If you could just take the time to answer a few questions, we will send you a free ring tone for your phone!"

Response Forms

These feedback appliances are prevalent on many websites but are not utilized effectively.  Feedback forms generally take the shape of a demographic splitter that lets us know who they are (whether customers, pre-sale prospects, advertisers, casual surfers, etc).  These should be analyzed for their relevance to the page (don't show website complaint forms on the investor information page), their ease of use (don't ask for information that you don't need), their presentation (big input boxes, large fonts, easy to find buttons), and their payback (put some time into the thank you pages and offer incentives rewarding them for "giving back" to the community).

Response forms should be integrated back into the customer profile as a method of tracking the health and status of the customer.  Customers that are interested enough to comment (or even complain) are customers that you should engage with to maintain an ongoing conversation.  If they complain, ask them to help evaluate the fix and get their feedback on what you did to remedy the situation.

Email Responses

A consistent thread of email responses should be processed and solicited through the customer lifecycle.  These email responses generally are lost and are not recorded or filtered to the proper customer account.  This information is rich with guidance and information and can help you maintain a pulse on the customer relationship.  A professional approach with filtering and sorting of the inbound email responses should be implemented and used with relish.