jrheard

oh my god no


LloydAtkinson

I really don’t get why OP seems so unsure about this, haha. This is a legitimately awful “practice”, and the fact that it has apparently been normalised speaks volumes. Incompetence taken to a new level.


nijuashi

I later privately raised this practice with the manager and even asked whether there had been any attempt to fix it. I was the new guy there, but still hired at a senior level. She looked at me as if I was the idiot and flat-out rejected the notion that there was anything wrong with it. After seeing all the fires and putting out a few big ones, I figured there were better places to practice my trade and skipped out. I just wanted to make sure I wasn’t the inexperienced outsider who didn’t ”get it”.


BeenThere11

You were not.


Goingone

Infosec people dying reading this


CooperNettees

lmao


notmyrealfarkhandle

Use something like proto, which is backwards- and forwards-compatible, to serialize data instead of pickling. I remember an outage caused by someone storing pickled objects in Redis and reading them back out: it blew up mid-deploy, when one version of the code did the writing and a different version did the reading.
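
Roughly what that failure mode looks like, as a minimal sketch (class and field names here are made up): pickle happily restores the old object's `__dict__` into whatever the class currently looks like, so the new reader falls over when it reaches for a field that was never stored.

```python
import pickle

# Version A of the service (pre-deploy) defines and writes this:
class Job:
    def __init__(self, user):
        self.user = user

blob = pickle.dumps(Job("alice"))  # imagine this blob sitting in Redis

# Mid-deploy, version B has renamed the attribute:
class Job:  # same import path, different shape
    def __init__(self, user_id):
        self.user_id = user_id

job = pickle.loads(blob)   # "loads fine": pickle just restores the old __dict__
print(job.user_id)         # AttributeError -- the new reader blows up on old data
```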


anemisto

While it's true that proto is forgiving when it comes to reading things written with a different version, people fuck up the compatibility rules all the damn time and make backwards incompatible changes, so it's hardly going to stop the outage you mention (to be fair, there's tooling to stop you from making bad changes... but that hasn't stopped me from witnessing more than one issue caused by people fucking up proto. Don't even get me started on the zero values.)


notmyrealfarkhandle

You’re definitely right, but in the particular case I was talking about, any decoding that didn’t blow up on an unexpected key would’ve been OK. Proto doesn’t fix things by itself, but proto plus best practices around schema changes helps a lot.


edgmnt_net

Unless I'm terribly mistaken somehow, I find proto quite deceptive and overpromising due to some weird choices they made. It looks like it allows schema changes, but in reality it just pushes complexity into code once you consider things like required versus optional fields. I haven't used it much, but the few times I looked into it, things seemed crazy in there. Yeah, of course not being able to add required fields keeps the schema open to fiddling, but your code is gonna need to deal with missing fields, or with consumers that silently ignore the extra stuff you just put in.


ncmentis

Having worked with Avro and JSON schemas, the same issues apply, so I don't think this is a proto issue. The main issue is that devs working with the schemas don't give much of a care about future compatibility and just start shoving fields in willy-nilly. We have some basic rules to validate the schemas during builds, but of course there's no money in doing things the right way...


edgmnt_net

I'm not very sure about those, but Protobuf 3 specifically removed the ability to mark fields as required because they think it hinders schema evolution. In effect, they're all optional. Which means consumers must expect any field to be missing and deal with it somehow, unless presence is enforced outside the spec. They also seem to think this positively helps schema evolution somehow, rather than just being part of a mitigation strategy that lets parsers be more flexible when needed. Obviously you can run into serious compatibility issues with Avro / JSON Schema too, but I don't recall them pushing weird stuff like that. That stance seems quite harmful if it leads people to think proto will magically solve API design for them. I think I'd rather take releasing v2 of an endpoint; it's just more honest.


thehardsphere

>What do you think one should do to improve the system?

Discontinue all use of pickle. Replace it with any other method of serialization. It is trivially easy to craft malicious pickled objects that execute arbitrary code during unpickling; your application has a gaping security hole, since anyone can feed it any Python class they like.

If your "workflow configuration" is really just configuration, then use any configuration file format instead of pickling the Python class that holds the configuration variables. If "workflow configuration" includes more than configuration (say, scripting), then I think you're going to need to think carefully about what requirements the user actually has and how to meet them safely.
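
To make the "trivially easy" part concrete, here's a minimal sketch (nothing here is specific to OP's system): `__reduce__` lets a pickled payload name any callable to be run at load time.

```python
import pickle
import os

# Any object can tell pickle how to "reconstruct" itself via __reduce__;
# whatever callable it returns is invoked during pickle.loads().
class Payload:
    def __reduce__(self):
        return (os.system, ("echo arbitrary code ran during unpickling",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # runs the shell command -- a real payload could do anything
```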


AchillesDev

>It is trivially easy to craft malicious pickled objects that execute arbitrary code during unpickling; your application has a gaping security hole, since anyone can feed it any Python class they like.

In many cases, if someone has access to where you're storing your pickled objects, you have much bigger problems on your hands.


potatersobrien

Yeah, it sounds asinine. If you have to convert it to text to modify it, maybe store it as text to begin with. You can't load those files without having all the classes loaded; rename a class and now you can't load the file.

Smalltalk applications are deployed by serializing a live, running environment. In fact, the Smalltalk runtime environment IS the IDE. If you have a problem in production, you inspect the system state using the IDE and can make changes, edit code, etc.


VStrideUltimate

I agree, text is king in so many ways. Leave binary for the robots.


yolobastard1337

A lot of (justified) hate for pickle here. But... can we also hate on the "manual editing" bit? Someone clearly pulled off some brilliant heroics, hacking at stuff in prod to resolve some incident. Somehow, though, these heroics became a process! Ideally, the system should have been fixed so that unblocking wasn't needed at all; but even if it's more complicated than that, some tooling should have been built to simplify these hacks (and that tooling should have hidden the details of pickle).

To slightly answer OP's question: it's quite common for teams to follow procedures that make no sense, and even more common in ops, where the people running the system are distant from the people who implemented it. I usually blame ops managers for playing dumb corporate games and wanting big teams.


notchatgptipromise

Maybe I am missing some context but that does sound bizarre. I'd think such a system would have a backfill functionality that would allow you to not touch the pickled object manually and just launch the workflow again to accomplish the same thing. Sort of reminds me of a workflow in ML where you train a model, then pickle it to send it elsewhere (a cache) to be called online. If the workflow fails, I'd never de-serialize, fix, re-serialize, then send off again. I'd find out why, re-run the workflow so the pickled model is valid (potentially overwriting the bunk object), and then be on my way.


await_yesterday

Completely bonkers. Probably whoever came up with this either didn't know how to serialize/deserialize the application objects, or couldn't be bothered. Maybe some junior cluelessly googled "how to save python object" and went with [the first result](https://stackoverflow.com/a/4529901).


F1B3R0PT1C

There are like a billion different ways to pass data around and store it, even for config. This is not a good choice.


Friendly-Pair-9267

I once had to maintain a data science project that had an ML model dumped to a pickle file. While upgrading the project's dependencies, the model stopped working: an import path had changed, causing import errors during `pickle.load`. My options at the time were (a) try to figure out how to rebuild/retrain the model from scratch or (b) manually edit the binary pickle file to fix the import paths. I obviously went with (b), and it worked fine afterwards, but making that a common practice is actually insane IMHO. Pickle has clear practical and technical issues, and that's not even counting the security issues that have been brought up in this thread.
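
In hindsight, a less hair-raising option would have been to subclass `pickle.Unpickler` and remap the moved import paths in `find_class`. Roughly (module and class names here are hypothetical):

```python
import pickle

# Map (old_module, old_name) -> (new_module, new_name) for things that moved.
RENAMES = {
    ("old_pkg.model", "Model"): ("new_pkg.model", "Model"),
}

class RenamingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect lookups for relocated classes, then fall back to normal resolution.
        module, name = RENAMES.get((module, name), (module, name))
        return super().find_class(module, name)

def load_model(path):
    with open(path, "rb") as f:
        return RenamingUnpickler(f).load()
```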


FoeHammer99099

What change could you possibly need to make that you can't do in code? Having some utility that deserializes the object, changes it, then serializes it again wouldn't be great but it would be a million times better than this.
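
Something like this, even as a throwaway script, would already beat hand-editing bytes (assuming the pickled thing is a plain object with attributes you can set):

```python
import pickle

def patch_pickled_config(path, **changes):
    """Load a pickled object, tweak some attributes, write it back."""
    with open(path, "rb") as f:
        obj = pickle.load(f)          # only ever do this with data you trust
    for key, value in changes.items():
        setattr(obj, key, value)
    with open(path, "wb") as f:
        pickle.dump(obj, f)

# e.g. patch_pickled_config("workflow.pkl", retries=3, notify=False)
```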


lordnacho666

Why is everybody criticising this design? In terms of security, reliability, and general sanity, it might not look great. But those are not the only design parameters. Imagine if you made a system that any old developer could just look at, and all the configurations were just some JSON or YAML written with English names for the variables. Maybe it's even documented so that ambiguities are reduced. And it uses secure practices so that you don't get hacked. Do you really want to share your job with potentially millions of intelligent, sane, English-speaking developers from around the world?


ShouldHaveBeenASpy

https://docs.python.org/3/library/pickle.html

Look at the big huge callout box above the fold. That's why. Separate from that, the design doesn't make any sense, because it neither keeps the system up nor makes it maintainable without performing high-risk, error-prone operations. Without knowing more about the process itself it's hard to say more, but clearly what's being done is dangerously bad.


Steinrikur

I was about to ask why anyone would use pickle for this rather than something like JSON. The docs themselves seem to recommend JSON over pickle in OP's case...
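
In OP's case that could be as simple as the sketch below, assuming the pickled "configuration" really is just a bag of plain settings (field names are made up):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class WorkflowConfig:
    name: str
    retries: int = 3
    notify: bool = True

def save_config(cfg: WorkflowConfig, path: str) -> None:
    with open(path, "w") as f:
        json.dump(asdict(cfg), f, indent=2)   # human-readable, diffable, editable

def load_config(path: str) -> WorkflowConfig:
    with open(path) as f:
        return WorkflowConfig(**json.load(f))
```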


ShouldHaveBeenASpy

Best guess? The people who coded this think they're doing something special/unique because they think their use case is unique. I bet you it's not, and that there's no reason to treat this configuration as anything other than some ENV variables or some JSON config. But before you propose a solution, it's critical to know what problem they're solving for, and that's not super clear here. How people talk about the engineering problems they're solving for is as important as the implemented solution in many cases.


lordnacho666

Ya don't say?


misingnoglic

Oh god please no