php - Best way to store "tags" for speed in enormous table


I am developing a large content site, with the table "content", with more than 50 million records. Here is the table structure:

  id (INT(11), INDEX), name (VARCHAR(150), FULLTEXT), description (TEXT, FULLTEXT), date (INT(11), INDEX)

I would like to add "tags" to this content.

I can think of 2 ways:

  1. Create a varchar(255) FULLTEXT column "tags" in the content table, store all tags separated by commas, and search row by row (which I think will be slow) using MATCH & AGAINST.

  2. Create 2 tables: a first table named "tags" with columns id and tag (varchar(30), INDEX or FULLTEXT?), and a second named "content_tags" with columns id, tag_id (int(11), INDEX) and content_id (int(11), INDEX), then search with a JOIN over 3 tables (content - content_tags - tags) to retrieve all content with the given tag(s). I think this is slow and a memory killer, because a JOIN of the 50M-row table * content_tags * tags is huge.

    What is the best way to store tags to make this as efficient as possible? What is the fastest way to search for and find content by a text (for example, "Movie 3D 2011") and a simple tag (for example, "video")?

    The table is about 5 GB now, without tags. It is MyISAM because I need to store the name and description in FULLTEXT for string searches (users currently search by these fields), and I need the best possible speed for searching by tags.

    Anyone with experience in this?

    Thank you!

    FULLTEXT indexes are really not as fast as you might think they are.

    Use a separate table to store your tags:

      Table tag
      ----------
      id           integer  PK
      tag          varchar(20)

      Table tag_link
      --------------
      tag_id       integer  foreign key references tag(id)
      content_id   integer  foreign key references content(id)
      /* this table has a PK consisting of tag_id + content_id */

      Table content
      --------------
      id           integer  PK
      ......
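As a rough, runnable sketch of that layout, here is the same three-table structure built in an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, and the `name` column, the `UNIQUE` constraint on `tag`, the extra index on `content_id`, and the helper names `make_db` and `link_is_unique` are assumptions added for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names follow the layout above.
SCHEMA = """
CREATE TABLE content (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL              -- placeholder for the real content columns
);
CREATE TABLE tag (
    id  INTEGER PRIMARY KEY,
    tag VARCHAR(20) NOT NULL UNIQUE -- assumption: each tag is stored once
);
CREATE TABLE tag_link (
    tag_id     INTEGER NOT NULL REFERENCES tag(id),
    content_id INTEGER NOT NULL REFERENCES content(id),
    PRIMARY KEY (tag_id, content_id) -- composite PK: one row per (tag, content) pair
);
-- Assumption: separate index for lookups that start from the content side.
CREATE INDEX tag_link_content ON tag_link (content_id);
"""


def make_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


def link_is_unique(conn):
    """Return True if linking the same tag to the same content twice is rejected."""
    conn.execute("INSERT INTO content (id, name) VALUES (1, 'Movie 3D 2011')")
    conn.execute("INSERT INTO tag (id, tag) VALUES (1, 'video')")
    conn.execute("INSERT INTO tag_link (tag_id, content_id) VALUES (1, 1)")
    try:
        conn.execute("INSERT INTO tag_link (tag_id, content_id) VALUES (1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False
```

The composite primary key is what keeps a tag from being attached to the same content twice, so the link table never needs its own surrogate id.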

    You select all content with a given tag using:

    SELECT c.*
    FROM tag t
    INNER JOIN tag_link tl ON (t.id = tl.tag_id)
    INNER JOIN content c ON (c.id = tl.content_id)
    WHERE t.tag = 'test'
    ORDER BY tl.content_id DESC /* latest content first */
    LIMIT 10;

    Because of the foreign keys, all fields in tag_link are individually indexed.
    WHERE tag = 'test' selects 1 (!) record.
    That record is equi-joined with, say, 10,000 tag_link rows, and each of those is equi-joined with 1 content record (each tag_link only ever points to 1 content).
    Because of the LIMIT 10, MySQL will stop looking as soon as it has 10 items, so it really only looks at 10 tag_link records.
    content.id is autoincrementing, so higher numbers are a very fast proxy for newer articles.
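To see what the single-tag query returns, here is a small sketch that runs it against made-up data in an in-memory SQLite database. SQLite again stands in for MySQL, the 50 sample rows and the helpers `build_demo` and `latest_with_tag` are hypothetical, and the sketch only checks the result, not how either database plans the joins.

```python
import sqlite3

# The single-tag query, with the tag and limit as bound parameters.
QUERY = """
SELECT c.id, c.name
FROM tag t
INNER JOIN tag_link tl ON (t.id = tl.tag_id)
INNER JOIN content c ON (c.id = tl.content_id)
WHERE t.tag = ?
ORDER BY tl.content_id DESC   -- highest id first, i.e. latest content first
LIMIT ?
"""


def build_demo():
    """Build a tiny sample database: 50 content rows and two tags (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE content (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT UNIQUE);
        CREATE TABLE tag_link (
            tag_id     INTEGER REFERENCES tag(id),
            content_id INTEGER REFERENCES content(id),
            PRIMARY KEY (tag_id, content_id)
        );
    """)
    conn.executemany("INSERT INTO content VALUES (?, ?)",
                     [(i, f"item {i}") for i in range(1, 51)])
    conn.executemany("INSERT INTO tag VALUES (?, ?)", [(1, "test"), (2, "other")])
    # Even content ids are tagged 'test', odd ids are tagged 'other'.
    conn.executemany("INSERT INTO tag_link VALUES (?, ?)",
                     [(1 if i % 2 == 0 else 2, i) for i in range(1, 51)])
    return conn


def latest_with_tag(conn, tag, limit=10):
    """Return up to `limit` (id, name) rows carrying `tag`, newest id first."""
    return conn.execute(QUERY, (tag, limit)).fetchall()
```

Calling `latest_with_tag(build_demo(), "test")` gives the ten highest even ids, starting at 50 and ending at 32, and an unknown tag gives an empty list.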

    In this case you never need to look for anything other than equality, and you start out with 1 tag that you equi-join using integer keys (the fastest join possible).

    There are no ifs or buts about it: this is the fastest way.

    Note that because there are at most a few thousand tags, any search on them will be much faster than searching the full content table.

    Finally:
    A CSV field is a very bad idea; never use one in a database.
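One way a comma-separated column can go wrong is sketched below, using a naive substring match rather than the FULLTEXT search proposed in the question. The `tags_csv` column, the two sample rows, and the helper `csv_vs_link` are made up for illustration, and SQLite stands in for MySQL.

```python
import sqlite3


def csv_vs_link():
    """Compare a substring search on a CSV column with a lookup through tag_link.

    Returns (ids matched via the CSV column, ids matched via the link table).
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- tags_csv is a hypothetical comma-separated column, shown only for contrast.
        CREATE TABLE content (id INTEGER PRIMARY KEY, name TEXT, tags_csv TEXT);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT UNIQUE);
        CREATE TABLE tag_link (
            tag_id     INTEGER,
            content_id INTEGER,
            PRIMARY KEY (tag_id, content_id)
        );
        INSERT INTO content VALUES (1, 'Movie 3D 2011', 'video,3d');
        INSERT INTO content VALUES (2, 'Console review', 'videogame,review');
        INSERT INTO tag VALUES (1, 'video'), (2, '3d'), (3, 'videogame'), (4, 'review');
        INSERT INTO tag_link VALUES (1, 1), (2, 1), (3, 2), (4, 2);
    """)
    # Substring match: 'videogame' also contains 'video', so row 2 matches too.
    csv_hits = [r[0] for r in conn.execute(
        "SELECT id FROM content WHERE tags_csv LIKE '%video%' ORDER BY id")]
    # Exact match on the tag, then an integer join through tag_link.
    link_hits = [r[0] for r in conn.execute(
        "SELECT tl.content_id FROM tag t "
        "JOIN tag_link tl ON t.id = tl.tag_id "
        "WHERE t.tag = 'video' ORDER BY tl.content_id")]
    return csv_hits, link_hits
```

Here the CSV search returns both rows, because "videogame" contains "video", while the link-table lookup returns only the row that really carries the tag.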
