D-2-The-Ave

Dagster, Prefect, and Airflow will most likely still be around, and surely used by some companies, 10 years from now. Will there be something better by then? Probably, but you can make the call to transition later if you need to, and maybe what you have will still be sufficient even 10 years down the line.


robberviet

You can bet on Dagster if you feel like it. Changing orchestrators is not that hard compared to changing other things like the compute engine or databases.


Kooky_Quiet3247

You are a lucky man, my friend. I've worked for several companies that had a large part of their processes written directly in the Airflow DAGs, so changing orchestrators meant changing thousands of files. But yeah, in the ideal world changing orchestrators should be easy.


bzimbelman

While nothing is guaranteed, Dagster will most likely still be around in 10 years, as will Airflow and Prefect. Hell, COBOL is still around and that is what, 60+ years later? So the real question is more around budget, support requirements and learning curve. Don't assume that you will use any tool out of the box and it will behave 100% as you want/need it. What are the warts? What are the costs, including hidden costs? What will you have to do to make the tool meet your needs? So, by all means go with Dagster if that is head and shoulders the better choice for your situation. But do understand that you will likely need to make some significant tradeoffs to make it meet all your needs, and be prepared to either develop those tools/solutions or give in on some requirements. And this isn't a knock on Dagster, it is a fine tool I'm sure. It's just that no tool is without its warts.


fatgoat76

It’s extremely difficult to see beyond the next 2 years in technology, let alone 10. It sounds like you should go with Dagster. Enough will likely change about your project over the next 10 years that the orchestrator will be the least of your concerns. Good luck.


tanlda

Inject your dollars into Dagster+ to keep them around 🗿


Tumbleweed-Afraid

Also it has the hybrid deployment option as well


DoNotFeedTheSnakes

I think Airflow is great. Empowering UI, great community and decent documentation. The code definitely has a batteries included approach. I'm not familiar with other solutions, but I've heard good things about Dagster as well.


anatomy_of_an_eraser

Do you want stability now or in 10 years? Decide based on that. Expensive rewrites are unfortunately part of this because we are still very much in the discovery phase of orchestration. There is no one way to orchestrate to rule them all. Even between Airflow 1 and 2 there are expensive rewrites. Prefect 1, 2 and now 3 all have breaking changes that require code to be rewritten, deployments to be updated and permissions to be granted.


luquoo

Yeah, I got burned hard by the prefect 1 -> 2 shift. 


anatomy_of_an_eraser

Tell me about it. They introduced so many concepts like storage, infrastructure, agents etc and have now decided to scrap all of that in Prefect 3.0. I hate that company with a passion…


luquoo

What got me was I was trying to define DAGs and stuff using Python. And that function was either super buggy or non-functional. So instead of it being a quick and easy port, I had to spend days troubleshooting, got to the point where even a Prefect solutions person couldn't help me, finally realized the CLI way worked and had to use that... I still think Prefect is better than Airflow, but you probably should run your own scheduling server rather than use their cloud stuff, cause then you can update on your own terms rather than being forced. Or just use cron unless you need something that scales hard.


anatomy_of_an_eraser

I know what you mean because I remember running into so many weird quirks of the prefect way (TM). Their support was, to put it mildly, completely utter fucking useless. Definitely see the appeal in self hosting it considering the learning curve is easier than airflow to get started. But I don’t think it’s stable enough for me to recommend it to any organization that wants a serious orchestrator.


Financial_Anything43

Airflow


drunk_goat

I think it's a good pattern to keep your DAGs simple in case you need to migrate them to a new platform.


startup_biz_36

I’m about to make my own. I despise these semi-open source projects because it’s a matter of time before they put certain features behind paywalls.


EatLessClimbMore

We adopted dagster a year and a half ago at my company and I really enjoy it, but the pricing went up a lot and is forcing us to refactor a lot of our code. We're ditching dagster+ because they seem too unreliable with pricing


lordirah

As long as you use the orchestrator tool for orchestration it should be fine with anything you have in the list


Justbehind

We wrote one ourselves. Orchestration is not all that complex if you know a little about queues and cron. The most futureproof solution is often one that limits external dependencies. Even if you need to upskill a bit internally.
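For anyone wondering what "queues and cron" can look like in practice, here is a minimal sketch, not the commenter's actual system. It assumes cron (or any scheduler) starts the script, and that tasks are plain zero-argument callables; `run_pipeline`, `tasks`, and `deps` are hypothetical names. Dependency ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
from collections import deque
from graphlib import TopologicalSorter


def run_pipeline(tasks, deps):
    """Run tasks in dependency order using a simple FIFO work queue.

    tasks: dict mapping task name -> zero-argument callable (hypothetical shape)
    deps:  dict mapping task name -> set of task names that must finish first
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    queue = deque(sorter.get_ready())  # tasks with no unmet dependencies
    finished = []
    while queue:
        name = queue.popleft()
        tasks[name]()  # no retries or logging here; a real system would add both
        finished.append(name)
        sorter.done(name)  # unlocks downstream tasks
        queue.extend(sorter.get_ready())
    return finished


# Example pipeline: extract -> transform -> load
log = []
tasks = {
    "extract": lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
    "load": lambda: log.append("load"),
}
deps = {"transform": {"extract"}, "load": {"transform"}}
order = run_pipeline(tasks, deps)
print(order)
```

Retries, alerting, backfills and a UI are the parts that take real effort, which is the trade-off against adopting an existing orchestrator.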


connerfitzgerald

If stability in the long term is a/the primary concern I would suggest Airflow. The [Lindy Effect](https://en.wikipedia.org/wiki/Lindy_effect) says that things that have been around for X will on average be around for 2X. Airflow as tool thou.....


engineer_of-sorts

I'm biased as we're building a platform ([Orchestra](https://getorchestra.io)) that does this, but I would say the more crucial thing is to ensure that you're using a modular architecture. Why do you care about it being in Python? Shouldn't the Python run where it needs to run and have the orchestration tool trigger *the infrastructure* (i.e. like the Kubernetes Pod Operator in Airflow)? This would be the most bulletproof way for you to build a project that lasts for 10 years: by having a modular architecture and not coupling your actual code to the OSS orchestration framework at all. If the asset stuff is a priority, tying yourself to Dagster is actually risk-on IMO because the way in which they *generate the data you need to analyse data assets* is highly proprietary and not recreated *at all* in other OSS frameworks (Airflow has the concept of Datasets, but this is very new and has not been that well-received so far).
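A rough sketch of the "orchestrator only triggers infrastructure" idea, under loud assumptions: a local subprocess stands in for a pod or container, and `JobSpec` and `trigger` are hypothetical names, not part of Airflow, Dagster, or Orchestra. The point is that the orchestration layer only knows a command to launch and an exit code to check.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class JobSpec:
    """What the orchestration layer knows about a job: a name and a command."""
    name: str
    command: List[str]


def trigger(job: JobSpec, timeout: float = 60.0):
    """Launch the job in a separate process and report exit code and stdout.

    The subprocess is a stand-in (assumption) for whatever actually runs the
    workload, such as a container or a pod.
    """
    result = subprocess.run(
        job.command, capture_output=True, text=True, timeout=timeout
    )
    return result.returncode, result.stdout.strip()


# The workload is ordinary code with no orchestrator imports.
job = JobSpec(
    name="load_rows",
    command=[sys.executable, "-c", "print('rows loaded: 3')"],
)
code, out = trigger(job)
print(job.name, code, out)
```

Swapping orchestrators then means rewriting `trigger` and the scheduling around it, while the workload itself stays untouched.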


britishbanana

Nothing about the asset feature in Dagster is proprietary, that's completely OSS. Just because other frameworks haven't implemented the concept doesn't mean it's proprietary.


engineer_of-sorts

It's OSS so you can, theoretically, reproduce the logic that produces the assets under the hood in a framework of your choice, of course you can. So no, it's not proprietary. But I guess if we get to the meat of it (and I could be totally wrong here, let me know), it sounds like one of your drivers for adopting an OSS framework that's durable (10 year horizon) is you *don't* want to be in a situation where you're locked into something that doesn't do the job you want it to. In that event, you would need to port that asset-generation logic out of whatever system you use into whatever you decide is best, which is not straightforward, even if that code is all freely available. DYOR, but from what I've seen speaking to many companies that adopted frameworks like Airflow 5 years ago, they're now in a similar situation where maintaining it requires quite a lot of resources at scale.


britishbanana

The same is gonna be true of any framework. Separate business logic from framework implementation, basic dependency inversion stuff, and it's really no bigger of a deal to migrate from Dagster's asset concept than it is to migrate from Airflow's Operator concept. I really don't see the difference, buying into any framework is going to require technical debt, that's just how it is.
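To make the dependency inversion point concrete, here is a small sketch under stated assumptions: `Orchestrator`, `InProcessOrchestrator`, and the order functions are all hypothetical, and no real Dagster or Airflow API is used. The business functions know nothing about the framework; only a thin adapter would change in a migration.

```python
from typing import Callable, Dict, List, Protocol


# --- Business logic: plain functions, no orchestrator imports ---
def load_orders() -> List[dict]:
    # Hard-coded sample rows for illustration only.
    return [{"amount": "10.50"}, {"amount": None}, {"amount": "4.25"}]


def clean_orders(rows: List[dict]) -> List[dict]:
    """Drop rows without an amount and convert the rest to floats."""
    return [{"amount": float(r["amount"])} for r in rows if r["amount"] is not None]


def total_revenue(rows: List[dict]) -> float:
    return sum(r["amount"] for r in rows)


# --- Abstraction the pipeline definition depends on ---
class Orchestrator(Protocol):
    def add_step(self, name: str, fn: Callable, upstream: List[str]) -> None: ...
    def run(self) -> Dict[str, object]: ...


# --- One adapter; a framework-backed one could replace it ---
class InProcessOrchestrator:
    def __init__(self) -> None:
        self._steps = []

    def add_step(self, name, fn, upstream):
        self._steps.append((name, fn, upstream))

    def run(self):
        results = {}
        # Assumes steps were added in dependency order (no sorting here).
        for name, fn, upstream in self._steps:
            results[name] = fn(*(results[u] for u in upstream))
        return results


def build_pipeline(orch: Orchestrator) -> None:
    orch.add_step("raw", load_orders, [])
    orch.add_step("clean", clean_orders, ["raw"])
    orch.add_step("revenue", total_revenue, ["clean"])


orch = InProcessOrchestrator()
build_pipeline(orch)
results = orch.run()
print(results["revenue"])
```

The debt still exists, it is just confined to the adapter and `build_pipeline` rather than spread through every task.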


engineer_of-sorts

Sure, the more you buy in, the bigger the technical debt; you have to make a trade-off. I won't belabour this point, but you would also assess the risk of the open-source project losing its funding, like with dbt. What's with the 10 year time horizon anyway?


britishbanana

Yeah agreed, a 10 year timeframe is quite the long view when most software needs a rewrite every 3-5 years at best. Systems that last longer are usually dumpster fires on life support.


engineer_of-sorts

haha


yoquierodata

Curious if anyone has deployed mage.ai ??


khaili109

Apparently they were buying fake GitHub stars. I also never hear of anyone using them.


unfair_pandah

Apparently that was a rumor started by a Dagster blog post