JavaProspect: Database Concerns

Wednesday, May 29, 2019

Database Concerns

To learn about the current and future state of databases, we gathered insights from 19 IT professionals. We asked, "What are your biggest concerns regarding databases today?" Here’s what they shared with us:

Security

Security of data at rest is still a big issue. Look at breaches like Equifax: attackers accessed the network and found a database with credentials for other databases, from which they were able to exfiltrate data.
We still have many companies that pay very little attention to database security. Sloppy database management and procedures often rear their ugly head through a data breach or cyber hack. Such errors cost all of us a lot of pain.
If you ask a CIO/CTO, data security is number 1, even though many people consider it an afterthought. Databases are the hidden workhorses of many companies’ IT systems, storing critical public and private data. Lately, there has been a high-profile focus on data security. With the advent of GDPR and other data-related regulations, security is paramount.

Complexity

Complexity and the sheer number of databases that are out there. There are so many options to consider that it is hard to understand the subtle differences between each type.
There is a plethora of databases available today, many of them open source, and organizations find it very hard to distinguish between them and determine which is best for their workload.
The landscape is very complex and evolving, particularly in the open source space. The number of platforms, and determining what’s relevant and what’s not, is very challenging for SMEs.

Scale

1) Current concerns are around scale and data being siloed. Slow database performance creates siloed data. Having updates and inserts processed differently kills innovation and agility. 2) Vendors overpromise and underdeliver. They are unable to meet the problems of scale; some can’t handle 20 concurrent users because of in-memory structure problems. Every vendor should ship a utility that helps customers prove the vendor’s claims and capabilities. Package it in the product so that customers can bring their own data, run the utility during a POC, and prove the capabilities.
Companies are trying to build applications that are reliable, always-on, and cannot have downtime. In practice, this has many organizations ending up with clustered systems as their database deployments and node counts scale, which in turn increases management overhead. If you don’t do things right from the start, scaling gets that much more challenging later on. Planning is critical to database scale success.
1) Latency spikes with mixed workloads at scale. Cassandra handles ingest, but if you combine it with reads and analytics, latency spikes. Ingestion is becoming real-time; it’s not batch anymore. Almost all solutions have latency spikes. Look at benchmarks. 2) Scale: systems have limits, or they become impossible to manage at scale because they are so complicated. The need for manual scaling, and scale limits at petabytes and billions of requests, are a big problem. 3) Data loss: can I trust the new transactional data solutions to take care of my data? 4) Silos: running multiple apps on non-integrated data without spinning up different database instances. 5) Security: for a mission-critical system of record, if you don’t have security all of the time, it’s a non-starter.
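The kind of proof-of-concept utility described above, one that runs a mixed read/write workload for 20 concurrent users and reports latency percentiles, could look roughly like the following. This is only a minimal sketch under stated assumptions: SQLite from the Python standard library stands in for whichever database is being evaluated, and the `events` table and the helper names `setup_database`, `run_user`, and `benchmark` are hypothetical, not part of any vendor's tooling.

```python
import os
import sqlite3
import statistics
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def setup_database(db_path):
    """Create the (hypothetical) table that the simulated users share."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
    )
    conn.commit()
    conn.close()


def run_user(db_path, user_id, ops_per_user):
    """Simulate one user issuing mixed reads and writes; return latencies in seconds."""
    # Each simulated user gets its own connection; timeout is the busy-wait in seconds.
    conn = sqlite3.connect(db_path, timeout=30)
    latencies = []
    try:
        for i in range(ops_per_user):
            start = time.perf_counter()
            if i % 5 == 0:
                # Write path: one insert for every five operations.
                conn.execute(
                    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
                    (user_id, f"op-{i}"),
                )
                conn.commit()
            else:
                # Read path: a simple aggregate over the shared table.
                conn.execute("SELECT COUNT(*) FROM events").fetchone()
            latencies.append(time.perf_counter() - start)
    finally:
        conn.close()
    return latencies


def benchmark(db_path, users=20, ops_per_user=50):
    """Run `users` simulated users concurrently and summarize latency in milliseconds."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(run_user, db_path, user_id, ops_per_user)
            for user_id in range(users)
        ]
        all_latencies = [lat for f in futures for lat in f.result()]
    # n=100 yields 99 cut points; index 49 is the median, index 98 the 99th percentile.
    cuts = statistics.quantiles(all_latencies, n=100, method="inclusive")
    return {
        "operations": len(all_latencies),
        "p50_ms": cuts[49] * 1000,
        "p99_ms": cuts[98] * 1000,
        "max_ms": max(all_latencies) * 1000,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "poc.db")
        setup_database(path)
        print(benchmark(path))
```

Comparing the 99th percentile and the maximum against the median is a simple way to spot the latency spikes under mixed workloads that respondents mention. For a real evaluation, the connection code and queries would be swapped for the candidate database's driver and the customer's own data.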

Other

Cost is always the biggest concern.
My biggest concern is when people adopt the wrong database or technology for a given problem or dataset, for example, using NoSQL to store huge volumes of highly structured data. That is just picking the wrong tool for the job. It may work for the initial use case, but it may not scale or remain usable.
Not necessarily a concern, but we continue to see the trend of people educating themselves on the right tool for the problem they’re looking to solve.
Some vendors, like MongoDB, have a big presence and are open source. Last year, a digital transformation effort adopted MongoDB, and no one could say why it was adopted. The decision was just handed down. The commercial entity was not open enough about what was available out of the box, and was not fully transparent about what you get in the open source version versus the paid version.
A big concern is addressing whether or not the data can be consumed. Sometimes before even doing the first step of filtering (and especially so with data that needs to be set in context with other data points), it must be stored. After that, you need to identify the right data by having a flexible database technology that also allows the user to explore the depths of the data.
We see three big opportunities. The first is offloading more database operations to fully automated and managed cloud services that allow databases to intelligently scale, self-heal, patch, and optimize performance. The second is the ability to manage data wherever it is created and stored, from the edge of the network in IoT and mobile devices through to the backend, using a single data platform that can seamlessly sync data across every environment. Third, as concerns about enterprise lock-in grow, there is also the potential for a data layer to act as the connection between hyper-scale cloud providers, allowing customers to move between different platforms quickly to exploit new features and giving them freedom in where best to run their apps.
There is excitement about the new role data plays in the business. GPU acceleration allows you to look at much more data than ever before, driving innovation.

Here are the contributors of insight, knowledge, and experience:

from DZone.com Feed http://bit.ly/2EDL7Pa
