Talk:Backup 2.0 - Improve XML parsing: Difference between revisions

From MoodleDocs
Revision as of 14:55, 2 March 2009

I moved this talk to the talk page.

Eloy, is that second one really necessary? Can't we find a way to process the data as we parse it? Mind you, a quick back-of-the-envelope calculation shows that 12.5MB of XML from one quiz is extreme, so if Method 2 is that good, perhaps we can be lazy and load each activity into memory one at a time.--Tim Hunt 19:05, 1 March 2009 (CST)

Hi Tim, I'm analysing more than just the memory/speed considerations of the current parsing. I started with that because current bugs are preventing people from restoring 1.9 courses, and I wanted to investigate that ASAP. For 1.9 there is no chance to change the architecture, but perhaps we could use Method 2 or Method 5 selectively when restoring quizzes. As for 2.0, the more I think about it, the more inclined I am to split the current monolithic moodle.xml into smaller pieces. That will bring another immediate memory reduction and speedup (it's an order of magnitude more efficient to process 20*1MB files than 1*20MB file). Anyway, I'm not sure we can make the whole process of parsing and restoring a pure-SAX process because, for example, the attempt must be created BEFORE the states and, strictly speaking, if we follow the SAX approach, the attempt tag hasn't been closed yet when the states arrive, so we haven't created the attempt. So perhaps we'll need to continue loading one module into memory. Luckily, Method 2 looks really nice (very fast, compressed enough and cheap in memory usage). Let's see how things evolve, thanks! --Eloy Lafuente (stronk7) 08:22, 2 March 2009 (CST)
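The ordering concern above (the attempt has to exist before its states, yet its closing tag only arrives after them) can be illustrated with a small sketch. This is not Moodle code: the XML layout, the tag names and the `restore_order` helper are all assumptions made for illustration, and the sketch only works if an attempt's own fields appear in the file before its nested states. Under that assumption, a SAX handler can "create" the parent as soon as the child container opens, instead of waiting for the parent's end tag.

```python
import xml.sax

# Hypothetical layout: each <attempt> has scalar fields first, then <states>.
SAMPLE = """<attempts>
  <attempt><id>1</id><userid>7</userid>
    <states>
      <state><id>10</id><answer>a</answer></state>
      <state><id>11</id><answer>b</answer></state>
    </states>
  </attempt>
</attempts>"""


class AttemptHandler(xml.sax.ContentHandler):
    """Records the order in which rows would be created during a restore."""

    def __init__(self):
        super().__init__()
        self.log = []      # (kind, fields) tuples, in creation order
        self.path = []     # stack of currently open tag names
        self.text = []     # character data of the innermost open element
        self.fields = {}   # scalar fields of the current attempt
        self.state = {}    # scalar fields of the current state
        self.attempt_created = False

    def startElement(self, name, attrs):
        self.path.append(name)
        self.text = []
        if name == "attempt":
            self.fields = {}
            self.attempt_created = False
        elif name == "states" and not self.attempt_created:
            # The attempt's own fields have been seen: create it now,
            # before any child state, without waiting for </attempt>.
            self.log.append(("attempt", dict(self.fields)))
            self.attempt_created = True
        elif name == "state":
            self.state = {}

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        parent = self.path[-2] if len(self.path) > 1 else None
        if parent == "attempt" and name != "states":
            self.fields[name] = value
        elif parent == "state":
            self.state[name] = value
        elif name == "state":
            self.log.append(("state", self.state))
        elif name == "attempt" and not self.attempt_created:
            # Attempt without any states: create it at its end tag.
            self.log.append(("attempt", dict(self.fields)))
        self.path.pop()
        self.text = []


def restore_order(xml_text):
    """Parse xml_text and return the creation order as a list."""
    handler = AttemptHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.log
```

If the file format cannot guarantee that the parent's fields come first, this trick does not apply and the whole attempt (or module) has to be buffered, which is the fallback described above.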
One option might be to let modules choose a smaller unit into which their data can be divided. So perhaps forum could somehow say it wants to work one thread at a time, and quiz might say it wants to process one attempt at a time (after restoring the core info as the first chunk). And perhaps for a big group wiki, you might want to restore one group at a time.
I guess what I am saying is that, when creating the API, perhaps modules should only be able to access the data coming from the XML through an API that looks like get_recordset, rather than get_records. Then, even if at first we implement what goes on behind the API by just loading everything into memory, we will be able to do something about it in future if this issue becomes a problem again.--Tim Hunt 08:55, 2 March 2009 (CST)
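To make the get_recordset versus get_records distinction concrete, here is a minimal sketch in Python rather than in Moodle's PHP. The two function names are borrowed from the comment above, but their signatures, the tag names and the sample data are hypothetical. The point is only that callers written against the iterator-style function do not care whether the data behind it is streamed or fully loaded.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; not the real backup format.
SAMPLE = (
    b"<attempts>"
    b"<attempt><id>1</id><userid>7</userid></attempt>"
    b"<attempt><id>2</id><userid>9</userid></attempt>"
    b"</attempts>"
)


def get_recordset(source, tag):
    """Yield one record (a dict of child tag -> text) per <tag> element.

    Records are produced lazily while the file is being parsed.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: (child.text or "") for child in elem}
            # Drop the element's children and text so they can be freed;
            # the emptied element itself stays attached to its parent.
            elem.clear()


def get_records(source, tag):
    """Eager variant: load every record into a list at once."""
    return list(get_recordset(source, tag))


def count_attempts(source):
    """Example caller that works one record at a time."""
    return sum(1 for _record in get_recordset(source, "attempt"))
```

A module written like `count_attempts` only ever holds one record, so the implementation behind `get_recordset` could later change from load-everything to true streaming without touching the module.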