Importing Large Forum (80 million posts, 150,000 users)

  • I'm wondering if the cli.php import tool has a way to import only new content, i.e. only posts that were made since the last import. Importing the entire forum will literally take days (more than a couple, from the looks of it), which makes it a little tricky to perform a switchover.


    Links to threads already answering this question are fine with me, but I couldn't find any myself. :)

  • As far as I know, this is currently not possible. The importer always tries to import the whole content again :(

  • I was afraid of that. I can definitely see it turning into a relatively tricky thing, what with being able to not only delete posts (which is subtractive) but to also move threads from one board to another (which simply changes an existing row, in most forums), and obviously a host of other seemingly "simple" differences.
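
    To make the problem concrete, here is a rough sketch in Python of what a "diff" between two snapshots would have to classify. The `Row` type, its fields, and `diff_snapshots` are made up for illustration and have nothing to do with the actual importer. It also assumes both snapshots fit in memory, which is not realistic at 80 million posts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical, simplified forum row (not the importer's real schema)."""
    row_id: int
    board_id: int
    body: str


def diff_snapshots(old, new):
    """Classify the differences between two {row_id: Row} snapshots."""
    common = old.keys() & new.keys()
    return {
        # Present only in the newer snapshot: the "easy" additive case.
        "added": sorted(new.keys() - old.keys()),
        # Present only in the older snapshot: the subtractive case.
        "deleted": sorted(old.keys() - new.keys()),
        # Same id, different board: an in-place change to an existing row.
        "moved": sorted(i for i in common if old[i].board_id != new[i].board_id),
        # Same id, different content: another in-place change.
        "edited": sorted(i for i in common if old[i].body != new[i].body),
    }


old = {1: Row(1, 10, "hello"), 2: Row(2, 10, "world"), 3: Row(3, 11, "bye")}
new = {1: Row(1, 10, "hello"), 2: Row(2, 12, "world"), 4: Row(4, 10, "new")}
changes = diff_snapshots(old, new)
```

    Only the "added" bucket could be found by looking at rows newer than the last import. The other three need a comparison against what was imported before.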


    Maybe someone will have some magical alternative method that'll help me out.


    A bit of additional information:
    - I'm importing from a replica
    - I'm importing to a separate database server that contains neither the original master nor its replica

    • Official Post

    Hi


    The included importer is not able to perform incremental imports, as doing so would require advanced database features to keep track of all the necessary changes that occur in the live forum. It's not sufficient to just import the new posts, as old posts might have been changed or deleted in the meantime. These features are not needed for the average board and may be disabled on the average web host, so the importer does not make use of them.
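
    As a rough illustration of what such change tracking could look like, the following Python sketch records every insert, update and delete in a log table and replays only the new log entries on each run. This is an assumption about one possible approach, not a description of any existing importer: triggers are just one candidate for the "advanced database features", SQLite stands in for the real database, and the `post` and `change_log` tables as well as `make_source` and `sync` are invented for the example.

```python
import sqlite3


def make_source():
    """Create an in-memory source DB whose triggers log every change to `post`."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE post (post_id INTEGER PRIMARY KEY, board_id INTEGER, body TEXT);
        CREATE TABLE change_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT, post_id INTEGER, op TEXT);
        CREATE TRIGGER post_ins AFTER INSERT ON post
            BEGIN INSERT INTO change_log (post_id, op) VALUES (NEW.post_id, 'upsert'); END;
        CREATE TRIGGER post_upd AFTER UPDATE ON post
            BEGIN INSERT INTO change_log (post_id, op) VALUES (NEW.post_id, 'upsert'); END;
        CREATE TRIGGER post_del AFTER DELETE ON post
            BEGIN INSERT INTO change_log (post_id, op) VALUES (OLD.post_id, 'delete'); END;
    """)
    return db


def sync(source, target, last_seq):
    """Replay log entries newer than last_seq onto target (a dict standing in
    for the destination forum) and return the new high-water mark."""
    log = source.execute(
        "SELECT seq, post_id, op FROM change_log WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, post_id, op in log:
        current = source.execute(
            "SELECT board_id, body FROM post WHERE post_id = ?", (post_id,)
        ).fetchone()
        if op == "delete" or current is None:
            # Row was removed (possibly after a logged edit): drop it.
            target.pop(post_id, None)
        else:
            # New or changed row: copy its current state.
            target[post_id] = current
        last_seq = seq
    return last_seq
```

    A first call with `last_seq` set to 0 copies everything that has been logged so far. Later calls pass in the returned value and only touch rows that changed since then, including moved and deleted ones.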


    I'm sure that we are able to build an importer that suits your needs, but I would first need to know more details about your community (e.g. the source software) and your hosting environment. I therefore suggest that you send me a conversation, shoot me an email (duesterhus@woltlab.com) or create a ticket to discuss this in private.