Import Process
An hourly cron job triggers the import (see config.php and Events.php)
Import::run() loops through all ImapConnections (ActiveRecords)
Import::handleConnection()
- gets the user and checks the user's permission to use the import
- loops through folder→space/profile pairs
Import::handleFolderRef()
- calculates and checks remaining import limit
- gets the space/profile (ContentContainer) and checks user's permission to post there
- gets the email folder using ImapConnection::getFolder() (the IMAP connection is established on first use)
- loops through the emails (the number per run is capped) until the import limit is reached
- creates an Email object for each, gets the HTML and plain-text bodies from Email::getBody()*
- moves the email using ImapConnection::move()
- creates an EmailPost object and adds it to an array
- closes the IMAP connection using ImapConnection::close()
- loops through the posts array and saves each post (counting it against the limit), so posts are saved only if the whole process succeeded
* Content filtering is done here, see below
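
The collect-then-save behaviour of Import::handleFolderRef() can be sketched as follows. This is a minimal illustration, not the module's actual code: importFolder(), the $move and $save callables, and the plain-array email representation are hypothetical stand-ins for ImapConnection::move(), EmailPost and the ActiveRecord save. Discarding the buffer on an exception is one assumed reading of "only saved if the whole process succeeded".

```php
<?php
// Hypothetical sketch: posts are buffered while emails are moved and are
// saved only after the whole loop has finished without an exception.

/**
 * @param array    $emails    fetched emails (any representation)
 * @param int      $remaining remaining import limit
 * @param callable $move      stand-in for ImapConnection::move(); may throw
 * @param callable $save      stand-in for saving an EmailPost
 * @return int number of posts saved
 */
function importFolder(array $emails, int $remaining, callable $move, callable $save): int
{
    $posts = [];
    try {
        foreach ($emails as $email) {
            if (count($posts) >= $remaining) {
                break; // import limit reached
            }
            $move($email);     // move the email out of the source folder
            $posts[] = $email; // buffer the post, do not save it yet
        }
    } catch (Throwable $e) {
        return 0; // assumption: on any failure the buffer is discarded
    }

    $saved = 0;
    foreach ($posts as $post) {
        $save($post);
        $saved++; // each saved post counts against the limit
    }
    return $saved;
}
```

Buffering first keeps a failing IMAP operation from leaving half of a folder saved as posts. What happens to emails that were already moved before a failure is not covered by this sketch.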
Content Filtering
Email::getBody() calls Email::filterHtml() for the html content
Email::filterHtml() applies filters in this order:
Calls Email::filterHtmlStatic() which:
- Uses HTMLPurifier for basic HTML filtering
- Removes 'ui-sortable-handle' class (prevents scrolling bug on mobile)
- Applies UriFilter for URL filtering
- Gets the general URL blacklist (admin setting in AdminSettings)
- Converts patterns to regex (supports wildcards)
- Filters URLs with preg_match()
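
The wildcard-to-regex step above could look roughly like this. The helper names wildcardToRegex() and isBlacklisted() are hypothetical, and the sketch assumes that "*" is the only wildcard, that a pattern must match the whole URL, and that matching is case-insensitive; the real UriFilter may differ on all three points.

```php
<?php
// Hypothetical helper: turn a blacklist pattern with "*" wildcards into a regex.
function wildcardToRegex(string $pattern): string
{
    // Escape all regex metacharacters (and the "#" delimiter) ...
    $quoted = preg_quote($pattern, '#');
    // ... then turn the escaped "\*" back into "match anything".
    return '#^' . str_replace('\*', '.*', $quoted) . '$#i';
}

// Hypothetical helper: true if the URL matches any blacklist pattern.
function isBlacklisted(string $url, array $blacklist): bool
{
    foreach ($blacklist as $pattern) {
        if (preg_match(wildcardToRegex($pattern), $url) === 1) {
            return true;
        }
    }
    return false;
}
```

Escaping with preg_quote() before expanding the wildcard means that a "." in a pattern matches only a literal dot, so an entry such as `*ads.example.com/*` does not accidentally match `adsXexample.com`.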
Calls KeywordFilter::filter() which:
- Gets keywords from UserConfig (user setting)
- Removes the most common forwarded-email headers (currently German and English)
- Creates a DOMDocument and uses XPath
- Loops through all p, div, td, and table elements (backwards)
- Comments out every node that contains one of the keywords (all matching nodes, not just the first one)
Note: The keyword filter is experimental and may need review. Plain text emails are not filtered.
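
The DOM part of the keyword filter can be sketched as follows. This is an illustration under assumptions, not the KeywordFilter implementation: commentOutKeywordNodes() is a hypothetical name, matching is done case-insensitively with stripos() on the node's text, and "--" inside a commented-out node is rewritten to keep the comment well-formed. Removing forwarded-email headers is left out.

```php
<?php
// Hypothetical sketch: comment out p/div/td/table nodes that contain a keyword.
function commentOutKeywordNodes(string $html, array $keywords): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy email HTML
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[self::p or self::div or self::td or self::table]');

    // Walk backwards so inner nodes are handled before their ancestors.
    for ($i = $nodes->length - 1; $i >= 0; $i--) {
        $node = $nodes->item($i);
        foreach ($keywords as $keyword) {
            if ($keyword !== '' && stripos($node->textContent, $keyword) !== false) {
                // "--" is not allowed inside a comment, so break it up.
                $markup = str_replace('--', '- -', $doc->saveHTML($node));
                $node->parentNode->replaceChild($doc->createComment($markup), $node);
                break; // node is gone, no need to test further keywords
            }
        }
    }

    // Return only what ended up inside <body>.
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Iterating backwards matters here: once an inner paragraph has been replaced by a comment, its text no longer counts towards the text content of the enclosing div or table, so the ancestor is kept unless it contains the keyword somewhere else.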