Site24x7 Product Roadmap for 2023

05-Jan-2023 02:27 PM

As we complete 15 years of existence and step into our sixteenth, we want to thank every customer who has been with us on this tremendous journey. We had a humble start as a website monitoring product, but we have now grown into a true-blue full-stack monitoring solution, covering all the layers of the IT stack.

2022 was a solid year for us and our customers. We enhanced our product and organically built features to cater to our ever-growing customer base (which now includes enterprises!). As organizations continued to consolidate their monitoring tools, we evolved to establish ourselves as a pioneer in providing a full-stack solution, with an exhaustive feature set, that delivers a unified user experience for DevOps and SREs alike. On the way, we faced unique challenges in terms of search, and we are delighted to convey that we have fulfilled most of what we set out to do.

As we step into 2023, our focus will be to strengthen the unified experience that we've been offering. The roadmap is centered on the theme of contextual knowledge, integration, and unification of monitoring data across all layers in your IT stack. We have ensured that our work integrates seamlessly with the whole DevOps and operations ecosystem.

Event correlation and root cause analysis

The complexity of IT systems and the ongoing exponential growth of telemetry data are driving the need to improve operational insights with AI/ML. With all the layers in the IT stack, finding the root cause of an incident can be a pain.

Going into 2023, our primary focus will be to give contextual recommendations from APM, infrastructure, logs, and end-user experience to troubleshoot problems and analyze important events. The aim is to help you find the root cause of problems and ultimately reduce the MTTR. We will provide more updates regarding this in our community posts.

Contextual integration

DevOps, developers, system engineers, network engineers, DBAs, operations engineers, SREs, webmasters, business owners, and FinOps professionals all play specific roles with important KPIs to look at. Often, team members cut across these personas and wear multiple hats.

However, seeing all the relevant data for each of these roles together contextually can still be a challenge. To address this challenge, we are making efforts to show contextual monitoring and log data in places where it makes sense. Are you missing any key information that you feel is needed to triage a problem? Comment below.

Platform features

In 2023, we plan to implement a frequently requested feature that will enable granular, role-based access to various sections of the product. Other enhancements include addressing feature gaps in audit and alert logs, improvements to the default dashboards and the dashboard creation experience, reporting enhancements, and more. Last year we also launched Terraform support for the product, which we will continue to enhance.

Infrastructure with deep insights

One specific focus area that has come into demand is deeper insight into databases. We will bring out database insights that will help DBAs and application owners troubleshoot query issues and examine DB KPIs (such as top executing queries, lock stats, sessions, and more) in detail. This will extend into query explain capabilities in the future. Our initial release will be for SQL Server and MySQL databases.

Log archiving for longer retention, parser support for custom log types, log plugins for popular log frameworks, log processors, and remappers are important features to look forward to in Applogs. Deeper contextual integration with APM traces will be available this year.

Cloud is here to stay. We as a product have grown to support services in the cloud, be it in AWS, Azure, or GCP. We will continue to put our efforts into strengthening cloud monitoring, supporting various discovery methods, and enhancing our Guidance report portfolio. RCA across services with contextual data, along with event ingestion, will be our focus this year.

To scale up for enterprise environments, we are coming up with a feature called on-premise poller group. This feature will have a customized load balancer that auto-detects the capacity of the pollers in a group and distributes the load evenly. It will have built-in AI-based outlier capabilities to detect the capacity of pollers.
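To make the idea concrete, here is a minimal sketch of capacity-aware load distribution across a poller group. It is only an illustration, not the product's implementation: the function names, the capacity numbers, and the use of a simple median-based statistical check in place of the AI-based outlier detection are all assumptions.

```python
from statistics import median


def detect_outliers(capacities, threshold=3.5):
    """Flag pollers whose capacity deviates strongly from the group.

    Hypothetical stand-in for outlier detection: uses the modified
    z-score based on the median absolute deviation (MAD).
    """
    values = list(capacities.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return set()  # all pollers look alike; nothing to flag
    return {
        name
        for name, value in capacities.items()
        if 0.6745 * abs(value - med) / mad > threshold
    }


def distribute(monitors, capacities):
    """Assign each monitor to the poller with the lowest load relative
    to its capacity, so larger pollers receive proportionally more."""
    load = {name: 0 for name in capacities}
    assignment = {name: [] for name in capacities}
    for monitor in monitors:
        target = min(load, key=lambda n: (load[n] + 1) / capacities[n])
        assignment[target].append(monitor)
        load[target] += 1
    return assignment


# Illustrative capacities (assumed units: monitors a poller can handle).
group = {"poller-a": 100, "poller-b": 100, "poller-c": 200}
plan = distribute([f"monitor-{i}" for i in range(8)], group)
for poller, assigned in plan.items():
    print(poller, len(assigned))
```

In this sketch, a poller with twice the capacity ends up with roughly twice the monitors, and a poller flagged by `detect_outliers` could be excluded from the group before distribution.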

This year we will continue to focus on adding new plugin integrations, including WordPress, LiteSpeed servers, IBM's AS/400 and AIX servers, Commvault backup, Akamai and Cloudflare CDN, Chef, Puppet, Ansible, and many more. An important strategy we will adopt this year is providing dedicated implementation support for applications that need monitoring but that we don't support out of the box. Do you have an application which needs to be monitored? You can raise your request here. Our team will get in touch with you and help.

Enhancements specific to the network module include a firmware vulnerability report in NCM, more default templates, and coverage for SAN and SD-WAN. Topology and a real-time world map are some of the visualization enhancements that the team will work on and release during the year.

APM

As applications become complex, with microservices interacting with each other at scale, the ability to troubleshoot a particular trace is essential. With our current sampling methodology we capture the problematic traces; however, customers have asked to be able to store every trace, exception, and component. This year our engineering focus will be to remove the capping and provide every detail so that critical troubleshooting traces are not missed. This will also help us narrow down root causes with better event correlation.

If your JVM sits on a small VM with restrictive memory, your microservice needs to be efficient and avoid leaking memory. In the coming year you'll see a memory profiler being introduced for Java (to start with), which can detect memory leaks. This will help development teams ensure that their code is written for optimal memory usage, and it will help with incident investigations in which the infrastructure team needs to point the application team to a potential memory issue in the code.

Another advanced troubleshooting feature that we will likely introduce this year is thread-dump analysis, which will provide a detailed breakdown of a thread dump.

One Agent for Site24x7

DevOps and operations engineers need a better way to manage situations where infrastructure capacity is fully utilized and apps become slow, or where a bug in the application burdens the infrastructure. In these cases, system KPIs, along with application metrics, play a vital role in identifying the root cause and reducing the MTTR.

This year, our team will unify the agent installation and management experience. With this release, engineers will be free of the hassle of maintaining different agents for APM and infrastructure in the product; the unified agent will take care of both.

With one agent, Automatic Discovery and Dependency Mapping (ADDM) will become much easier and will help create relationships automatically across the entire IT infrastructure. You will also be able to include the agent in your CI/CD pipelines. This will in turn help us provide accurate event correlation and RCA.

More updates in this regard soon.

End user experience

As we mature in our end user monitoring capabilities, customer feedback and our own experience have motivated us to integrate synthetic and real-time application monitoring. This includes the integration of APM data and logs with REST API and transaction monitoring (synthetic). Similarly, we plan to develop reports with contextual KPIs for RUM and synthetic together.

Waterfall chart for RUM, session recording, browser console log collection, alerting for key KPIs like JavaScript errors, and webpage custom constraints are some of the key updates that you can expect in 2023 for RUM.

As with plugins, we will have a dedicated service team that will help in troubleshooting recorded synthetic transaction steps that have errors. The goal will be to help resolve any unique configuration errors that may occur during a recording.

MSP

Our MSP edition will have renewed focus in 2023. To ease on-premise poller management, we will support on-premise pollers for MSP. Other important updates we expect to roll out include a better user experience for MSP admins onboarding new customers, the propagation of admin settings to customers, and the ability to generate license reports across customers, with easy navigation between customer accounts within the same session.

The solid foundation we have laid for this product has empowered us to go deeper into each monitoring layer and spread our wings to cover allied segments and cater to emerging business cases. We are confident that our product’s unified approach to monitoring will enable you to achieve all your observability and monitoring goals and sharpen your IT operations game, so that you can proactively fix issues before your customers are affected.

From all of us at ManageEngine Site24x7 and Zoho, we wish you all a happy new year and a prosperous 2023.

The Site24x7 team
