Why sqlite3 temp files were renamed 'etilqs_*' (2006)

3 years ago (github.com)

I have a related story:

Around the year 2000 I was working operations in the NOC for WebTV (then owned by Microsoft). For those who don't know, WebTV was a little set-top box with a modem which would dial up on demand and provide a very basic web/chat/email experience on the TV. The box would call a 1800 number to figure out its own phone number, then re-dial on a local toll-free number with a local sub-contracted ISP.

One of the services we had would periodically send a UDP datagram out to online clients to let them know they had new email. The settop box would then light up a little indicator light.

Of course, sometimes the client would hang up. The IP might get allocated to a PC dialup user. And sometimes, that PC dialup user might be running a firewall that was popular back then, called BLACK ICE DEFENDER.

BLACK ICE DEFENDER had all these (not so) cool features, the kind that semi-technical people love. For example, it would log ATTACKS. What are ATTACKS? Unrecognized traffic, of course.

Sometimes the little UDP datagram for our "you have mail" service would be delivered to a PC user running BLACK ICE DEFENDER, which would register it as an ATTACK. It would then ever so helpfully look up the ARIN contact information to see who sent the errant datagram -- which had the NOC phone number. It would then tell the user "THIS ENTITY IS HACKING YOU" and imply that contacting them would be productive. Yes, you could pick up a phone and call the Microsoft NOC. Back then, the internet was a smaller place.

My job was to check the NOC voicemail, which was reliably filled with very angry people. Often they would threaten that they've reported us to the FBI or somesuch, or that it confirmed some conspiracy theory or another. We played the good ones on speakerphone for entertainment.

Good times. Doesn't happen anymore.

  • Great story :)

    The same year I was working on various online music stores using Microsoft's Windows Media DRM. This would cause a licensing window to pop up in Media Player all the time when the license was missing or expired for someone's music. We would get various complaints emailed to us, and being head developer sometimes the thornier ones would end up in my inbox, and my friendly ass would be kind enough to reply and try to figure them out.

    One time one of the senior execs was walking past my screen and peered over my shoulder to read a new email which said "EVERY MORNING YOU ARE ON MY COMPUTER GET OFF MY COMPUTER!!!". The exec leaned over, typed "GO FUCK YOURSELF" and hit reply.

    The benefits of being the boss...

  • Microsoft ends up, wrongly, in the firing line for a bunch of things, similar to the story you told and to the sqlite_ comment in TFA.

    Notice that on iOS (for example) if an app crashes it just disappears > poof! < and it’s gone.

    On Windows when an app crashes (or you kill a non responsive app), you then see a dialog with Microsoft Windows branding saying that it is logging that the app crashed. As nerds we understand why… but the result is that the user doesn’t curse the app, they curse Microsoft. Whereas when an iOS app disappears they curse the app, not Apple.

    • macOS is a better comparison to Windows than iOS is, since iOS is just for iPhones and not desktops. macOS will show you the stack trace of a crashed app in a dialog. This allows more technical people a chance look into the reason for the crash. Users have the option to share the stack trace with Apple.

      4 replies →

  • The people who left those voicemails are now the ones leaving very odd and paranoid posts on GitHub issue trackers. There's a few unfortunate souls out there convinced that something called Lighthouse is trying to hack them.

  • Holy moley, my first “real” computer ran 98SE and had BLACK ICE, which behaved exactly the way you described.

    I remember I would take the IP addresses of the “attackers” and plug them into a McAfee graphical tracert tool, and it felt like nothing short of something out of GoldenEye.

    Thanks for the memories!

Similar story: Friend and I put out a kernel driver (uxstyle.sys) that would patch Microsoft's theming digital signature checks. It was free, buggy, and bugchecked the OS on upgrade. It was unsurprisingly added to compatibility blocks in Windows. I fixed the bug and asked Microsoft to loosen the block (version X and below). Microsoft refused citing a EULA violation. Valid or not, I renamed the driver to elytsxu.sys to circumvent their check and the app worked well enough until third-party theming fell out of favor.

I often search for weird files in my %userprofile% (there are a lot random ones) just out of curiosity, despite I know they're not malicious.

It doesn't help that if you Google any filenames, or even any semi-obscure file extensions, there would always be plenty of blogspam articles saying they're "possible virus". And oftentimes, there is no legit article to say what they really are even if you try, if they're from some relatively less popular software.

I don't know, the more time passes the more I convince myself that the net benefits of antivirus software do not (an maybe never) exceed their downsides. In decades I've heard so many stories about AV software behaving suspiciously, using borderline shady tricks to monitor user activity, causing severe performance degradation, etc

  • > (an maybe never) exceed their downsides

    There were certainly times when they were necessary: when Windows had nothing built-in to defend itself, and for a time after then when those built-in features were crap.

    Those times are pretty much over now IMO. I'd go as far as to suggest that the market is now an attempt at a protection racket and hardware hawkers are complicit: things come pre-installed on new laptops and make very misleading claims about what might happen if you uninstall them instead of subscribing after the free trial period (ref: Dad got a new laptop recently, I went through and removed all the junk included with it, I can see why people with little technical experience might just pay up).

    • > There were certainly times when they were necessary

      Around the time of Windows 7 I stopped using anti-virus software and nothing happened. And for a long time before that, paid antivirus software (ESET NOD-32) wasn't finding anything. I think the simple rules of having a router with a firewall, not clicking random files, not using Internet Explorer, and keeping Windows up to date covered 99% of possible exploits, and the other 1% was luck.

      2 replies →

    • The fact that a computer bought from a store, that you pay good money for, comes with this crapware installed shows that even when you do pay, sometimes you are still the product not the customer.

  • A couple years ago I heard a podcast that referenced a study that polled 2 groups of people; those that (admittedly, SELF) identified as "security professionals" and those that did not.

    The data point was "what is the biggest thing your group does that the other does not?"

    The professionals came in as "multi-factor auth" The non-professionals came in as "anti-virus"

  • There was a time when they made sense, but it's been many years since the benefits of running a third party antivirus program outweighed the drawbacks.

  • I remember our Java developers being very unhappy with ESET requirement that made the Linux boxes compile performance literally halve

  • Seemed to be a win-win for Windows and the AV companies. Windows would let cheap interns pump out code (autorun.inf) and anti-virus companies would throttle your computer looking for anything exploiting the garbage code. Windows Vista was like the high-water mark for a slapped together shit OS with its AV "protection" — "that place where the wave finally broke and rolled back".

  • There’s a deeper economic lesson in here. Essentially it could be summarised as: government cannot leave policing to the market.

    In modern capitalist democracies with social benefits there’s a lot of debate/opinion about what the state should do versus leave to the market.

    Imagine you establish a successful little utopia on an island somewhere. At first you outsource policing to a few strong and forthright friends, then as the place grows you let the market decide. But as the place really matures you end up having to bring policing “in house”, controlled by the state. There is still plenty (plenty!) of ways it can be corrupt… but when anti-virus / policing etc is fully outsourced - over time its corruption levels completely saturate.

    Similarly - a military power that relies on mercenaries/soldiers of fortune/contracted parties. It cannot work beyond a very short term. (Machiavelli goes into this in a lot of detail in ‘The Prince’ … he was super evil but he was no fool.)

    Seems strange to say that Bill Gates should’ve read more Machiavelli and he would’ve avoided creating the leeching anti-virus industry for so long.

Love it.

Too bad there aren't enough Mac users to prompt a similar backlash against Macs littering every computer they visit on the network with .DS_Store and other turds.

  • This isn't remotely comparable. Those .DS_Store files are created in arbitrary directories by the Apple file manager or something. The SQLite temp files are created in the OS-specific temporary directory (e.g. C:/Users/username/AppData/Local/Temp or whatever on Windows) which is specifically intended for that purpose. SQLite isn't doing anything wrong; that's where it's supposed to store temporary data that doesn't fit into memory.

    The problem is that virus scanners sometimes misclassify those temp files as belonging to malware apps, or sometimes they might be written by real malware apps, but even in the latter case, that only happens because the malware uses sqlite as a library. The malware isn't written by the SQLite authors, so complaining to them is pointless.

    • You misunderstood what the comment in the code was saying.

      Firstly, it not that anti-virus software was misclassifying it -- it's that a particular one, Mcafee, was USING sqlite. And more importantly, it didn't put the file(s) in the proper %temp% folder. Instead, it created it as `C:\temp\{name}.sqlite_`. This is why users found them suspicious and complained to devs.

      Also keep in mind putting random files or folders under root of C:\ was, while never recommended, a common practice in 2000s (some still do so even today). So what Mcafee was doing is hardly unheard of. But people still freak out when they see these in their C:\ (TBH, understandable).

      > %userprofile%

    • They're not arbitrary at all. They're in directories visited by Finder, storing the open/closed state of subdirectories, and apparently other view data.

      Apple, in its arrogance and backwardness, doesn't store those choices on the BROWSING user's computer; it stores them in the directories on the computer being browsed... where, by the way, they'll be trounced by the next Finder user who comes along.

      1 reply →

    • > Those .DS_Store files are created in arbitrary directories by the Apple file manager or something.

      .DS_Store files come from Apple's file systems containing a separate data and resource fork. APFS can natively store the contents, but for foreign file systems/network shares, a .DS_Store file is created to store those attributes.

      16 replies →

  • You can avoid creating .DS_Store files on network and USB volumes like so.

      defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
      defaults write com.apple.desktopservices DSDontWriteUSBStores -bool true

  • .DS_Store has a origin story though not as entertaining https://www.arno.org/on-the-origins-of-ds-store

    • Apple's perennial lack of interest in truly fixing Finder shows in the fact that they've left this ridiculous bug in there for 20+ years, despite yet another "ground-up rewrite" that was supposed to (or did) happen at some point in the 2010s.

      My other favorite Apple filesystem cock-up is the inclusion of a "Contents" directory in every bundle, which never has any siblings. I mean... what kind of ass thinks that a directory needs a subdirectory to hold the contents of itself?

      The one thing Apple fixed in Finder that annoyed the living #$%! out of me for all previous years was that you couldn't sort files WITH FOLDERS AT THE TOP. Of course the fix was incomplete, broken in some places at first, including your Desktop directory. But that did get fixed surprisingly quickly.

  • That's the fault of whoever owns the other computers. If a user ever littered files all over my computer I would revoke write access until they fix their shit.

Why not to use sub-directories inside a temporary directory instead of file name prefixes?

The relevant snippet:

   /*
   ** Temporary files are named starting with this prefix followed by 16 random
   ** alphanumeric characters, and no file extension. They are stored in the
   ** OS's standard temporary file directory, and are deleted prior to exit.
   ** If sqlite is being embedded in another program, you may wish to change the
   ** prefix to reflect your program's name, so that if your program exits
   ** prematurely, old temporary files can be easily identified. This can be done
   ** using -DSQLITE_TEMP_FILE_PREFIX=myprefix_ on the compiler command line.
   **
   ** 2006-10-31:  The default prefix used to be "sqlite_".  But then
   ** Mcafee started using SQLite in their anti-virus product and it
   ** started putting files with the "sqlite" name in the c:/temp folder.
   ** This annoyed many windows users.  Those users would then do a 
   ** Google search for "sqlite", find the telephone numbers of the
   ** developers and call to wake them up at night and complain.
   ** For this reason, the default name prefix is changed to be "sqlite" 
   ** spelled backwards.  So the temp files are still identified, but
   ** anybody smart enough to figure out the code is also likely smart
   ** enough to know that calling the developer will not help get rid
   ** of the file.
   */

Looks like the line numbers were lost: https://github.com/mackyle/sqlite/blob/18cf47156abe94255ae14...

It's because McAfee started using SQLite, angry users would stumble upon the files, do a minimum of searching or thinking, and be furious at SQLite developers.

  • Daniel Stenberg has also gotten some ... interesting ... emails because people found the curl license somewhere and assumed he was responsible for $stuff.

    https://daniel.haxx.se/blog/2016/01/11/tales-from-my-inbox-p...

    https://daniel.haxx.se/blog/2016/01/19/subject-urgent-warnin...

    https://daniel.haxx.se/blog/2018/02/16/why-is-your-email-in-...

    https://daniel.haxx.se/blog/2021/02/19/i-will-slaughter-you/

    • I've always wondered if Daniel gets a particular amount of email of this sort because his address includes "haxx"--for someone without tech savvy, it's not all that much of a jump to assume an email that includes something sounding like hacking is the email of the hacker that has attacked them.

      1 reply →

  • I wonder why users were angry about some files in their temp folder. Did McAfee fail to cleanup those files, or were they too big?

    Edit: more information here: https://www2.sqlite.org/cvstrac/wiki?p=McafeeProblem Apparently, McAfee kept those files locked when it was using them, so the files couldn't be deleted and people got angry that they couldn't clean them up. Sounds like a loud minority to me.

    • There's some level of power user that will try to fix things, not understand what's going on, and yell at the world when they break things further.

      I used to have a popular freely available 3rd party DLL that was included in lots of software packages. Because I was silly, it had my email address in the metadata that'd show up if you clicked "Properties" in Explorer. I'd get plenty of emails from random people asking for and sometimes _demanding_ help with software I've never heard of. I'm sure if I had an easy to find website with my name and a forum on it, it'd be full of such angry comments.

      1 reply →

    • In Linux, these files are unlinked, so they are invisible in the filesystem. It is legal to unlink an active file descriptor, but continue reads/writes to it.

      I think that lsof can still see these temporary files; I'm not sure how I first noticed it.

      Windows implements a POSIX kernel layer, so perhaps this functionality could be coaxed out of it.

      7 replies →

    • > Sounds like a loud minority to me.

      Imagine for a moment the cross section of Windows users that 1) have McAfee running 2) poke around the Windows temp folders and/or run some sort of "cleaner" to "optimize" their system and 3) don't recognize the name "sqlite". I can imagine the exact type of user, peak Dunning-Kruger and utterly insufferable. Even just seeing their e-mails before deleting them would be maddening. I think most people on HN can vividly imagine the sort or remember back when they were that insufferable twat.

      3 replies →

Somewhat related - I’m very very curious to hear a detailed account of someone who uses SQLite for a production app with high traffic.

For the embedded use case I think it’s a slam dunk, but there are many interesting use cases for server side but they all seem to be toyish.

The locking behavior of SQLite is somewhat problematic unless you use WAL and even then not perfect

  • > there are many interesting use cases for server side but they all seem to be toyish.

    > The locking behavior of SQLite is somewhat problematic unless you use WAL and even then not perfect

    SQLite with WAL and synchronous configured appropriately will insert a row in ~50uS on NVMe hardware. This is completely serialized throughput (i.e. the highest isolation level available in SQL Server, et. al.). At scale, this can be reasonably described as "inserting a billion rows per day".

    I have yet to witness a database engine with latency that can touch SQLite. For some applications, this is a go/no-go difference. Why eschew the majesty of SQL because you can't wait for a network hop? Bring that engine in process to help solve your tricky data problems. I'd much rather write SQL than LINQ if I have to join more than 2 tables.

    We've been exclusively using SQLite in production for a long time. Big, multi-user databases. Looking at migrating to SQL Server Hyperscale, but not because SQLite is slow or causing any technical troubles. We want to consolidate our installs into one physical place so we can keep a better eye on them as we grow.

    Fun fact: SQL Server "Hyperscale" is capped at 100MB/s on its transaction log. I have written to SQLite databases at rates far exceeding this.

  • A super smart guy on my team at a previous job replaced around $100,000 of server and SAN hardware used for a(n) (admittedly incredibly absurdly designed, well before our time) analytics system built using MySQL filtered replication triggering stored procedures in this magical Rube Goldberg-ian dystopia with a 3 node Flask app running with 2 cores and 4 GB of RAM each performing the analytics work JIT. The app would use in-memory SQLite3 tables to perform the same work as the stored procedure operations, and cost about 50ms of extra time per request for a feature of the app that was rarely used.

    Admittedly, not high traffic like you asked, but one of my favorite uses of SQLite hands down.

  • Apps aren’t divided into “high-traffic” and “toys.” There are plenty of use cases where you have a low-write server in a production environment, and SQLite would work fine there.

    If you need high write volume, then yes, the locking behavior means SQLite is not a good fit.

    • SQLite can easily hit 15k INSERTs per minute or more (setting processor affinity to a single core helps drive the max rate up). However, if a process begins a transaction and then stalls, it halts all dml.

      I think performance can be good, as long as a competent schema design is in place. Allowing ad-hoc queries from less trusted users will surely tank performance.

  • There are plenty of use cases of SQLite being used in production servers, usually for cache locality of small datasets (think more complex read-focused Redis use cases) or for direct data operations (transforms, for instance).

    That being said; even if it weren't usable in a web service space, does that make it any less reasonable of a database? That whole mentality sounds like a web developer centric one. Berkeley DB was used for decades for application and system databases, a field that SQLite largely replaced it in. And one that MySQL, Postgres, Oracle, etc are generally completely unsuited for. It's the same reason Microsoft offered MS Access for so long alongside MSSQL, until MSSQL had it's own decent embeddable option to deprecate it.

  • I've been pushing to try SQLite as a "sacrificial memozation." Basically we have two tasks separated by 5-10 days. When we do the first task, we calculate a bunch of information as a "side effect". At the second task, we don't have that information and trying to reconstruct it without the original context is very slow, because a lot of it is dependent on temporal state-- what was happening 5-10 days ago.

    The other use case I'm eager to explore is as persistence of status data in a long-running process. Occasionally the task blows up halfway and although we can recover the functional changes, we lose the reporting data for the original run. If we save it in a SQLite database instead of just in-script data structures, we don't have to try to reverse engineer it anymore.

    In both cases, I like the idea of "everything's one file and we nuke it when we're done" rather than "deal with the hostile operations team to spin up MySQL infrastructure."

  • Yes, the sqlite defaults are quite terrible out of the box. I'm not sure why they never changed them, it will start choking at 5k inserts where other dbs can do 100x that (and so will sqlite in wal and a few other settings). Getting it to perform well in high traffic scenarios would be a lot of effort. I struggle to get it to be vaguely performant in embedded use cases and often roll my own poor man's version unless I really care about data integrity (which is rare).

    • > I'm not sure why they never changed them

      Because SQLite is run in many different environments and scenarios, and what's terrible in one scenario is perfect for another. There are no defaults that will work for everyone. This also applies to MySQL, PostgreSQL, etc. but the range of scenarios for those is more limited (no embedded for example) so the defaults are a bit more tuned to what's suitable for your scenario.

    • Isn't it more rare to not care about data integrity? Being unsure what state your data is at any point in time does not seem like a safe scenario.

  • High-volume single-writer + WAL mode is perfectly doable for a lot of applications, but you have to keep the concurrency model in mind. I think of it more like a really advanced data structure library than a "database" in that sense.

    An old product I worked on (which is still around) used (still uses?) SQLite for storing filesystem blocks that came from the paired backup software. A single database only contained the blocks backed up from a single disk (based on its disk UUID.) So, this is a perfect scenario since there's only one writer at a time because there's only one device and one backup on that device at a time. We could write blocks to SQLite fast enough to saturate the IOPS of the underlying device and filesystem containing the database. This was over 10 years ago. Very practical and durable and way better than anything we could do in house.

    Today, you can definitely do GB/s on NVMe class hardware with the right pragma settings and the right workload. So for certain classes of multi-tenant solutions I think it's not so bad, you can just naturally have a single writer to a single data store (or can otherwise linearize writes) and the raw performance will be excellent and more than enough.

This is a bad choice, I wouldn't understand that it needs to be read backwards. Why not use process's executable name instead?