Posted by thunderbong 4 days ago
[1] https://github.com/openobserve/openobserve?tab=readme-ov-fil...
There are free and open source solutions like Keycloak and Zitadel. I don't dispute they are less common than Okta and Entra, but they definitely exist and are deployed in the real world. My workplace (state government) uses Keycloak, for example.
Another thing the article doesn't really touch on is that the SSO tax locks a security best practice behind a paywall. With SSO, when someone leaves the organization you can disable their single account and be confident they are locked out of your shared folders, GitLab, Jira, etc., rather than having to manually track down and disable each account individually, with a high likelihood of missing something. This matters for any organization larger than one person, from a bootstrapped startup all the way to the Fortune 500. Hiding it behind a higher price tier makes it more likely that an org will try to do without and suffer a security breach as a result.
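To make the offboarding point concrete: with an IdP, deprovisioning can be a single call against one system rather than a hunt through every SaaS app. A minimal sketch below builds (without sending) the Keycloak admin REST call that disables a user; the base URL, realm, user ID, and token are all placeholders.

```python
import json
import urllib.request


def build_disable_request(base_url: str, realm: str, user_id: str,
                          token: str) -> urllib.request.Request:
    """Build (but do not send) the Keycloak admin API call disabling a user.

    Keycloak disables an account via PUT /admin/realms/{realm}/users/{id}
    with the body {"enabled": false}. All arguments here are placeholders.
    """
    url = f"{base_url}/admin/realms/{realm}/users/{user_id}"
    body = json.dumps({"enabled": False}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# One disabled IdP account locks the ex-employee out of every app federated
# through it; no per-app cleanup in GitLab, Jira, shared drives, etc.
req = build_disable_request("https://sso.example.com", "acme",
                            "user-123", "TOKEN")
```

Without SSO, the equivalent is one such cleanup per application, each with its own API (or none), which is exactly where accounts get missed.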
I also take issue with:
> Developing and maintaining SSO solutions requires significant investment in research, development, and infrastructure.
Having done it myself, this is overstated. No feature is free but implementing a SAML or OAuth flow is not THAT much work, nor does it represent a huge amount of ongoing maintenance.
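For a sense of the scale of work involved: the service-provider side of an OAuth 2.0 authorization-code flow boils down to two steps, a browser redirect to the IdP and a server-to-server code-for-token exchange. A sketch under assumed placeholder endpoints and credentials:

```python
import secrets
from urllib.parse import urlencode

# Placeholder IdP endpoints; a real integration reads these from the
# provider's OIDC discovery document or configuration.
AUTHORIZE_URL = "https://idp.example.com/oauth/authorize"
TOKEN_URL = "https://idp.example.com/oauth/token"


def authorization_redirect(client_id: str, redirect_uri: str):
    """Step 1: URL to send the browser to; keep `state` to verify on return."""
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid email profile",
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}", state


def token_request_body(code: str, client_id: str, client_secret: str,
                       redirect_uri: str) -> dict:
    """Step 2: form body for the server-to-server POST to TOKEN_URL."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    }


url, state = authorization_redirect("my-app", "https://app.example.com/callback")
```

Validation, token storage, and error handling add to this, but the core flow really is a redirect plus one POST; SAML is heavier, though mature libraries exist for both.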
I actually don't mind the SSO tax too much in cases where it's the differentiator between free or open source vs paid. I find it far more egregious when it's a product that already has a cost and SAML auth jacks up the price 2-10x. I don't think the blog post is a particularly good discussion of the tradeoffs though.
Ahh you want to make it easier to enable this in your org, in order to get better adoption and ensure the data in our app is more secure, yeah you're going to need to pay us for that.
| Vendors need to recoup costs
| Industry standards: The SSO tax has become an industry standard
Well in that case fine /s
1 - Is OpenObserve providing SSO security for ALL your applications? No. Is it doing SCIM, identity governance, provisioning? No... It's like saying you pay for a sandwich, so why don't you pay for the door you used to come into the shop as well. Door tax.
I bet they don't charge you to recoup the cost of implementing a JS library. So why are they 'recouping costs' on adding support for the OIDC/SAML standards? Build your solution to support SAML, SCIM and OAuth, and allow anyone to consume it.
Why?
Adoption and security. Anyone who's a Google Workspace or Microsoft shop has an IDP (albeit basic but OK). Most orgs see the IDP capability there as free. They are then seeing the ability to leverage it as a paid offering in the SaaS apps they buy. So on the one hand, the Identity Provider is free, but the SSO endpoint on the app is paid? Wild.
Also, this is wild:
| For our cloud service we provide SSO in our free tier for following providers with plan to support more in future: Google, GitHub, GitLab, Microsoft
This is great, well done.
| SAML and OIDC are available in our enterprise tier.
WTF? The built-out integrations that you had to make UI elements for, you offer free (the vendor cost-recoup argument died here). The generic ones are the ones you charge for. Ahh, that's right: the generic ones are the ones that let you use Okta, Ping, OneLogin, Keycloak, etc. Got it, the "valuable" ones.
I'm really starting to get sick of companies that claim they operate at petabyte scale, then tell you that you need to spend $400k a month to support that scale.
How many open source log systems work at PB scale given any number of resources? Also FWIW, OpenObserve can ingest data at 28 MB/Sec/Core (We are working on optimizing it even more) and ingesting 1 PB of data would cost just $435 based on on-demand prices (AWS m7g family).
Is that production rate, inbound bandwidth, rate to persistence, rate to processed, or rate to display?
It's not "only 28 MB/Sec/Core". Try doing the same with Splunk/Elasticsearch: you won't get past 5 MB/Sec/Core on their best day (typically it will be lower).
Suppose I have 28 GB of trace data in memory on a machine and then I fire that off. What do I have after 1000 seconds?
Do I just have a file of 28 GB of raw trace?
Do I have 28 GB of raw trace in memory ready to be indexed?
Do I have a data structure in memory ready to be searched?
Do I have the full trace information rendered on my screen (or an aggregated visualization derived after processing all the data)?
If it is the first, that would be ridiculously slow. If it is one of the latter ones, then it would depend on what querying operations are fast.
28 MB/core-second makes no sense without the context of what you can do quickly after the “processing” is done.
We have spent over 2 years building OpenObserve into a simple, highly usable and efficient observability tool. You could run it using a single binary that provides all the functionality of logs, metrics, traces, front end monitoring, dashboards (18 different chart types), alerts and pipelines.
OpenObserve is being used by startups, mid-tier enterprises and Fortune 100 companies. There are thousands of active installations of OpenObserve globally.
Folks have replaced Elasticsearch, Splunk, Graylog, Datadog, New Relic and more with OpenObserve.
Comment from a user:
> We moved from a 5 node OpenSearch cluster to a single node OpenObserve and measured using our actual everyday queries, which are reasonably complex queries (1 to 5 conditions applied) over our real logging data. We see that typically they complete in about the same time. OpenObserve costs us 10 times less though (instances + storage).
Also, we are currently working on replacing one of the world's largest Splunk installations.
p.s. I am one of the maintainers of OpenObserve. Feel free to ask questions. I will be happy to answer them. You can also visit our slack workspace at https://short.openobserve.ai/community for discussions.
I know personally of several companies which use this stack with tens of thousands of nodes. Everything runs on object storage which means it is easy to scale up, and you can move between storage providers as long as they implement the S3 API. The grafana ecosystem is very sticky as well. If I download some random helm chart it most likely will come with a grafana dashboard which I can easily import and instantly have dashboards and alerts for the helm chart.
Not very convincing why I wouldn’t go for the LGTM stack which has been proven to be effective.
For those looking for much more simplicity and much higher performance, OpenObserve is the way to go. Many folks have moved from Loki to OpenObserve due to performance issues with Loki. Many have moved off the LGTM stack completely to OpenObserve. Many have chosen to use Grafana as a front end for OpenObserve too.
Take a look at how easy it can be to build dashboards in OpenObserve.
It takes time for a community and ecosystem to build around great products. Grafana started in 2014. OpenObserve started in 2022.
Show me OpenObserve ingesting a few billion metric streams and show me how query times are faster than Mimir.
We run benchmarks internally and will publish them once we are ready, but we are unlikely to benchmark someone else's product. Benchmarking someone else's product always leads to a conversation along the lines of "you ran your benchmark with the most optimized settings but used our non-optimized ones", or some version of that.
People who use OpenObserve like it for its ease of setup, ease of management, high performance and rich feature set.
Especially for logs, Grafana and Loki are no match for OpenObserve in terms of features and performance. I'll let you test it yourself if you are curious and have some spare time.
OpenObserve is used by people ingesting MBs of data in their basement to PBs of data in large clusters in AWS, Azure, GCP and other cloud environments.
BTW, here is a story of a large EV company who moved to OpenObserve for traces and increased performance by a factor of 10x and reduced their cost at the same time - https://openobserve.ai/blog/jidu-journey-to-100-tracing-fide...
Like I said, the industry is small and I know several colleagues running LGTM at large scales (10k-100k nodes) so I know the system works and it is a safe bet.
https://github.com/openobserve/openobserve/blob/v0.12.1/LICE...
OpenObserve clusters can ingest PBs of data every day. While more could be discussed, I would rather focus on what your needs are when it comes to observability. Let's talk about them. I will be more excited to answer those questions.
OO integration may save me a couple of days of setup, but long-term it's a dangerously limiting idea / lock-in.