Posted by bcye 7 hours ago

Big Data on the Cheapest MacBook(duckdb.org)
247 points | 222 comments | page 4
Jeffrin-dev 7 hours ago|
[flagged]
post-it 7 hours ago|
Did you use AI to write a one-paragraph comment?
SteveNuts 6 hours ago|||
> MacBook Neo's NVMe being slower than the Air/Pro isn't just a benchmark footnote — it compounds with file count in a non-linear way.

The “isn’t just” part is a dead giveaway almost always.

Schiendelman 7 hours ago||||
It's probably worse than that - he probably used AI to look for things to comment on and automatically create comments. He's not even here.
4ggr0 6 hours ago||||
it really is the stereotypical form of "it's not just this, it's actually that", and even an em-dash!
_joel 6 hours ago|||
Account was created 3 days ago, probably one of those bloody clawdbots.
opentokix 6 hours ago||
Mind blown: if you need to handle "big" data on the move, the MacBook Neo is not the right choice. Who would have guessed that outcome?
g947o 6 hours ago|
It occurs to me that there is near zero overlap between people who use a Macbook Neo and people who run DuckDB locally.

It would be a surprise if more than 0.1% of Macbook Neo users have even heard of DuckDB.

Which means that this article is probably just riding the hype.

hrmtst93837 6 hours ago||
Trying DuckDB on lower-end MacBooks does show you don't need much muscle for moderate-size analytics. Long term it isn't cost-effective compared to budget laptops, but it's super simple for self-contained pipelines. The thing is, 8GB RAM leaves you stuck once your data actually grows past the marketing demo.
NetMageSCW 4 hours ago|||
Can’t give up and admit that 8GB of RAM is enough, can you?
g947o 3 hours ago|||
I think you completely missed the point.

People buy Macbook Neo because they "just need a laptop" or are budget conscious.

I imagine a student would get their feet wet with Postgres before looking at DuckDB or similar.

It would be a surprise if they do heavy workloads with DuckDB. In which case it's definitely worth investing in a more powerful computer.

hermanzegerman 7 hours ago||
That's an awesome idea to get a bricked MacBook Neo really fast because those idiots soldered the SSD inside
windowsrookie 7 hours ago||
Apple has been soldering the SSD into MacBooks for over 10 years now, and most 10 year old MacBooks still have a working SSD.
hermanzegerman 5 hours ago||
Not if you're power-using it like in the article and relying heavily on swap.

Also, there are countless reports of M1 8GB MacBook Airs that were bricked because the SSD used up its write cycles

https://youtu.be/0qbrLiGY4Cg?si=mjKn2oLjqAb36hPU

havaloc 5 hours ago||
That's not what the video insinuates.
hermanzegerman 1 hour ago||
Yes, you're right. I meant a different video, but I can't find it right now. I've looked it up, and back then macOS had a bug which exacerbated that issue. Here is an article:

https://www.macrumors.com/2021/02/23/m1-mac-users-report-exc...

lachlan_gray 6 hours ago|||
Not sure about the ssd in particular but the neo is apparently pretty modular

https://www.youtube.com/watch?v=5k7Lv7f-5CQ

sam345 6 hours ago||
Fantastic teardown. Thank you. Amazing for Apple. I hope this is the trend going forward, but probably not. But still a gazillion screws? I just replaced the keyboard on my old HP EliteBook with two screws.
hermanzegerman 5 hours ago||
I don't care about a gazillion screws, if it's serviceable in the end.

If Apple built their laptops to be as serviceable as ThinkPads, I would buy one today.

MBCook 3 hours ago||
It seems like they’re starting to learn the cost of being too integrated.

They’ve slowly been moving towards making it easier to repair individual broken parts. I’m very happy to see that a new keyboard doesn’t require replacing the entire top case. That was just crazy.

k4rnaj1k 7 hours ago||
[dead]
ramgale 6 hours ago||
Seems completely unnecessary, there is probably 0 overlap between people who buy a cheap MacBook and people running DuckDB locally
MBCook 3 hours ago||
I agree, I don’t think it’s going to be something people really do.

I just thought it was neat. It’s a phone chip, we’ve never been able to do stuff like this on an Apple phone chip before. No one was porting this to the iPhone to run there.

In my mind this is purely a curiosity article, and I like that.

swiftcoder 5 hours ago|||
I've used MacBook Airs as primary dev machines multiple times in my career (before Apple silicon, when Airs had truly shit performance).

There is always a trade-off of cost/convenience/power, and some folks are going to end up at the Neo end of the spectrum.

ExxKA 6 hours ago|||
I love small form factors, and I am what you'd call a professional :P
leoedin 5 hours ago||
I think the form factor is basically the same (maybe slightly thicker) as a Macbook Air. It's basically an Air with lower performance in most dimensions.
dartharva 4 hours ago|||
You'd be surprised. There are many of us analysts in the third world who are paid pennies and expected to build large-scale exec dashboards from nontrivial data - with no cloud support whatsoever. ETL has to be local from hundreds of GBs of csv dumps.
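
For anyone wondering what local ETL over CSV dumps larger than RAM looks like in the simplest case, here is a minimal sketch using only the Python standard library. It streams rows one at a time, so memory use grows with the number of distinct groups rather than with file size. The column names `region` and `amount`, the helper `sum_by_key`, and the file name in the comment are all hypothetical, made up for illustration; an engine like DuckDB would likely be much faster for the same job.

```python
import csv
import io
from collections import defaultdict


def sum_by_key(lines, key_col, value_col):
    """Stream CSV rows and sum value_col per key_col.

    `lines` is any iterable of text lines (an open file works), so the
    whole dump never has to sit in memory at once.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key_col]] += float(row[value_col])
    return dict(totals)


# Tiny in-memory stand-in for a large dump. With a real file you would
# pass open("dump.csv", newline="") instead (hypothetical file name).
sample = io.StringIO("region,amount\nnorth,10.5\nsouth,4\nnorth,2.5\n")
totals = sum_by_key(sample, "region", "amount")
print(totals)
```

The same loop works on a file of any size as long as the set of distinct keys fits in memory, which is the usual situation for dashboard-style aggregates.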
NetMageSCW 4 hours ago||
It’s necessary because the ignorant keep saying 8GB of RAM is a deal-breaking limitation on the cheapest MacBook available.
TutleCpt 7 hours ago||
Oh great, the term "big data" is back.
michalc 7 hours ago|
So my definition of big data was data so big it cannot be processed on a single machine in a reasonable amount of time.

I guess they’re using a different definition?

jawns 7 hours ago|||
I think it's partly tongue in cheek. When "big data" was overhyped, everyone claimed they were working with big data or tried to sell expensive solutions for it, and some reasonable minds spoke up and pointed out that a standard laptop could process more "big data" than people thought.
rattray 7 hours ago||||
> For our first experiment, we used ClickBench, an analytical database benchmark. ClickBench has 43 queries that focus on aggregation and filtering operations. The operations run on a single wide table with 100M rows, which uses about 14 GB when serialized to Parquet and 75 GB when stored in CSV format.

very much so…
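
For a rough sense of scale, the figures in that quote work out as below. This is only back-of-the-envelope arithmetic on the quoted numbers, assuming "GB" means decimal gigabytes (10^9 bytes); the 8 GB RAM figure is the one mentioned elsewhere in this thread, and the variable names are my own.

```python
# Back-of-the-envelope sizing from the quoted ClickBench figures.
# Assumption: "GB" means decimal gigabytes (10**9 bytes).
ROWS = 100_000_000
PARQUET_BYTES = 14 * 10**9
CSV_BYTES = 75 * 10**9
RAM_BYTES = 8 * 10**9  # 8 GB RAM figure taken from the thread

parquet_bytes_per_row = PARQUET_BYTES / ROWS
csv_bytes_per_row = CSV_BYTES / ROWS
csv_to_parquet_ratio = CSV_BYTES / PARQUET_BYTES
parquet_to_ram_ratio = PARQUET_BYTES / RAM_BYTES

print(f"Parquet: {parquet_bytes_per_row:.0f} bytes/row")
print(f"CSV: {csv_bytes_per_row:.0f} bytes/row")
print(f"CSV is {csv_to_parquet_ratio:.2f}x the Parquet size")
print(f"Parquet is {parquet_to_ram_ratio:.2f}x the RAM size")
```

So even the Parquet form is larger than RAM under these assumptions, which means the data cannot all be held in memory at once.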

rrr_oh_man 6 hours ago||||
In my former life as a soulless consultant, mid-level IT managers really liked to hear the 3 "V"s mentioned: Velocity, Volume, Variety
speedgoose 6 hours ago||
The V of Value is very important in some circles.
speedgoose 6 hours ago||||
Computers got bigger and software got smarter.

You have phones that are faster than cloud VMs of the past. You can use bare metal servers with up to 344 cores and 16TB of RAM.

I used to share your definition too, but I now say that if it doesn’t open in Microsoft Excel, it’s big data.

Zambyte 6 hours ago||
Processing data that cannot be processed on a single machine is fundamentally a different problem than processing data that can be processed on a single machine. It's useful to have a term for that.

As you say, single machines can scale up incredibly far. That just means 16 TB datasets no longer demand big data solutions.

speedgoose 6 hours ago||
I get your point, but I don’t know if big data is the right term anymore.

Many people like to think they have big data, and you kinda have to agree with them if you want their money. At least in consulting.

Also you could go well beyond a 16TB dataset on a single machine. You assume that the whole uncompressed dataset has to fit in memory, but many workloads don’t need that.

How many people in the world have such big datasets to analyse within reasonable time?

Some people say extreme data.

bcye 7 hours ago||||
I think they are simply referring to analytical workloads.
brudgers 6 hours ago||||
“Your data isn’t big” is a good working definition of big data.

Google has big data. You are not Google.

antonyh 4 hours ago||
I think the definition of big is smaller than that. Mine was "too big to fit on a maxed-out laptop", effectively >8TB. Our photo collection is bigger than that, it's not 'big data'.

Or one could define it as too big to fit on a single SSD/HDD, maybe >30TB. Still within the reach of a hobbyist, but too large to process in memory and needs special tools to work with. It doesn't have to be petabyte scale to need 'big data' tooling.

evanjrowley 5 hours ago|
>Can I expect good performance from the MacBook Neo with Slack, Microsoft Office, and Google Chrome signed into Atlassian and a CRM, all running simultaneously?

No.

>Do I reject a world where all of the above is necessary to realize value from an entry-level MacBook?

In theory, yes.