
Stop thinking of RAM/disk/etc as storage systems and start thinking of them as retrieval systems. Then stop thinking of your data costs as $/GB (storage systems) and start thinking of your data costs as $/(IO/sec/GB) (retrieval systems).

I know everyone these days seems to think that removing a structured query language parser from a database makes every other problem go away, but realistically RDBMS vendors spend millions of dollars trying to fix this exact problem. It's called cache invalidation and it's a hard problem to solve in a general way.

SSDs are just a midpoint in the performance trade-off game.

The OS is the worst at this and DBs are somewhat better, but realistically, if you want serious performance out of your application, you need to make those choices yourself and use all strategies where appropriate:

- RAM for records you need instantly (memcached, Redis, MongoDB)
- SSDs for the stuff you can't afford to keep on an SSD
- Hard drives for stuff you can't afford to keep on an SSD
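A minimal sketch of that tiering idea, using plain Python dicts as stand-ins for a RAM cache, an SSD store, and a disk store (all names and data here are illustrative, not any real API):

```python
def tiered_get(key, tiers):
    """Look up key in the fastest tier first; on a hit, promote the
    value into every faster tier so the next read is cheap."""
    for i, tier in enumerate(tiers):
        if key in tier:
            value = tier[key]
            for faster in tiers[:i]:
                faster[key] = value  # promote toward RAM
            return value
    return None  # miss in every tier

# Plain dicts stand in for memcached (RAM), an SSD store, and disk.
ram, ssd, disk = {}, {}, {"photo:42": b"...jpeg bytes..."}
tiered_get("photo:42", [ram, ssd, disk])
# After the first read, "photo:42" also sits in ram and ssd.
```

Real systems complicate this with eviction, TTLs, and invalidation (the hard problem mentioned above), but the lookup path is essentially this loop.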

What you need to think about is the value of your data in dollars per IO/sec per GB ($/(IO/sec/GB)): if the amortized value of that data exceeds the amortized cost of the retrieval system, then buy it. Focus on increasing the value of your data, not reducing the cost of its retrieval, as that cost will drop by half every 18 months anyway. Alternatively, change your business model so you are going short on IO/sec/GB (e.g. pre-sell storage so that when you need to buy it you can do so cheaply).
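To make the $/(IO/sec/GB) framing concrete, here is a back-of-the-envelope comparison in Python. The prices and per-GB throughput figures are made-up round numbers for illustration, not real quotes:

```python
# Illustrative (not real) prices and per-GB random-IO figures.
tiers = {
    "RAM": {"price_per_gb": 5.00, "io_per_sec_per_gb": 1_000_000},
    "SSD": {"price_per_gb": 0.10, "io_per_sec_per_gb": 10_000},
    "HDD": {"price_per_gb": 0.02, "io_per_sec_per_gb": 2},
}

def dollars_per_io_rate(tier):
    """$ per (IO/sec) for one GB of data held on this tier."""
    t = tiers[tier]
    return t["price_per_gb"] / t["io_per_sec_per_gb"]

for name in tiers:
    print(f"{name}: ${dollars_per_io_rate(name):.2e} per IO/sec per GB")
# Viewed as a *retrieval* system, RAM is the cheapest tier per IO/sec,
# even though viewed as a *storage* system it is the priciest per GB.
```

That inversion is the whole point of the $/GB vs $/(IO/sec/GB) distinction above.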

What I'm trying to say is that a picture is worth more to Flickr than it is to Facebook, so Flickr will have an easier time building its retrieval systems than Facebook will, given the costs involved. That's why Facebook had to write its own filesystem for retrieving pictures.

I'd bet that any commercial DB would run rings around redis/mongo/etc. if you had your persistent store on a RAM disk and used hard drives for the transaction log. The cost of a SQL Server license is negligible if you're going to buy a server with $200,000 worth of RAM in it. If your data is valuable enough, you could even keep everything in SRAM (L1/L2 cache) and buy processors just for the cache.



This has been well understood since antiquity (in the CS world, at least). Read Jim Gray's "The Five-Minute Rule" and the more recent papers that cite it. Most likely the access frequency of your objects isn't high enough to justify keeping them in L2.
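Gray's rule boils down to a one-line break-even formula: keep a page in RAM if it's re-referenced more often than the interval below. A sketch, using roughly the numbers from the 1997 revisit of the paper (treat them as illustrative, not current):

```python
def break_even_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                       price_per_disk, price_per_mb_ram):
    """Break-even reference interval from Gray & Putzolu's five-minute
    rule: cache a page in RAM if it's accessed more often than this."""
    return (pages_per_mb_ram / accesses_per_sec_per_disk) * \
           (price_per_disk / price_per_mb_ram)

# Roughly the 1997-era figures: 8 KB pages (128 per MB of RAM),
# 64 random IO/sec per disk, a $2000 disk drive, $15 per MB of DRAM.
interval = break_even_seconds(128, 64, 2000, 15)  # ~267 s, about 5 min
```

Plug in today's RAM and SSD prices and the break-even interval moves, which is exactly why the follow-up papers keep revisiting it.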

Ultimately there's no need to use a commercial database either, as there are compelling open source alternatives, though if your needs are very specific, a commercial database may be your best tool.


> This has been well understood since antiquity (in the CS world at least).

Yes I believe it was first Cicero that pointed this out. Or perhaps even Aristotle. ;)


> SSDs for the stuff you can't afford to keep on an SSD

I think you meant 'in RAM' there.



