Preliminary Results on the Use of Artificial Intelligence for Managing Customer Life Cycles

Abstract: During the last decade we have witnessed how artificial intelligence (AI) has changed businesses all over the world. The customer life cycle framework is widely used in businesses and AI plays a role in each stage. However, implementing and generating value from AI in the customer life cycle is not always simple. When evaluating AI against business impact and value, it is critical to consider both the model performance and the policy outcome. Proper analysis of AI-derived policies must not be overlooked in order to ensure ethical and trustworthy AI. This paper presents a comprehensive analysis of the literature on AI in customer life cycles from an industry perspective. Of the 224 peer-reviewed articles returned by a Scopus search, 31 were included in the study. The results show a significant research gap regarding outcome evaluations of AI implementations in practice. This paper proposes that policy evaluation is an important tool in the AI pipeline and emphasizes the significance of validating both policy outputs and outcomes to ensure reliable and trustworthy AI.


I. INTRODUCTION
AI is having a significant impact on businesses worldwide. One example is AI ranking systems in the content creation industry. Their adoption has transformed the way content is produced, with the goal of being ranked higher by AI models and gaining a competitive advantage. This is a good example of implicit policy impact, in which the intended outcome, information ranking, is just one of the actual outcomes. The effects of such changes are beyond the scope of this article, but it is critical to emphasize the difference between intended and actual policy outcomes.
AI is not a goal in itself; rather, it is a tool with the purpose of adding value. When businesses use AI to gain a competitive advantage or to improve organizational efficiency, effectiveness, and customer relations, evaluating the actual policy outcome is just as important as evaluating the model performance, i.e. the policy output. Decisions made by AI should be held to the same standards as those made by their human counterparts.
As AI systems become more advanced and widely used in businesses, it is essential to ensure that they are making decisions that align with the values and goals of the organization, and that they are not causing unintended harm or bias. By holding AI systems to the same standards as human decision-makers, businesses can ensure that their AI policies and models are transparent, accountable, and trustworthy.
This can help to build trust with customers and stakeholders, and ultimately lead to greater success and sustainability for the business.
In the telecom industry, the term "churn" is frequently used to describe when a consumer cancels a subscription. No matter how effective an AI model is at predicting churn, it does not, by itself, benefit the company. However, the model does add value when it is part of churn prevention efforts. The output, in this case, can be described as ranking customers by churn propensity, while the desired outcome is to lower the churn rate. Suppose a decision maker designs a policy where a discount is offered to the highest churn-risk customers each month. A model with high recall but low precision may result in the treatment of customers who were not going to churn but now generate less revenue. In contrast, a model with high precision but low recall may target only customers who will churn regardless of the discount, so that the treatment has no effect. Even with perfect predictions, the sensitivity to proactive retention treatments must be taken into account to ensure the best possible outcome [1]. To avoid adverse effects, the policy must be carefully considered and evaluated in relation to the desired outcome.
The use of AI to improve customer life cycles is not a novel concept. Churn prediction models, for example, have been around since the early 2000s [2]. Based on the results presented in the latest research, it is tempting to conclude that highly accurate propensity or ranking models imply higher business value. However, policies based on AI models may not produce the desired results in practice. The industry is eager to adopt AI, but coverage of quantitative evaluations of AI-based policies is almost non-existent. It is necessary to understand the risks and impact of AI wherever it is implemented, and the customer life cycle is especially exposed as it often involves direct interaction with customers.
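The distinction between policy output and policy outcome in the discount example can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than data from the literature: the `Customer` fields, the four hypothetical customers, a one-month horizon, and the simplification that a customer either is or is not persuaded by the discount.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    score: float            # predicted churn propensity (the policy output)
    will_churn: bool        # would churn if left untreated
    persuadable: bool       # a discount would make this churner stay
    monthly_revenue: float


def evaluate_discount_policy(customers, top_k, discount):
    """Offer a discount to the top_k highest-scored customers.

    Returns (precision, recall, revenue_change), where revenue_change is
    the one-month revenue relative to doing nothing (the policy outcome).
    """
    ranked = sorted(customers, key=lambda c: c.score, reverse=True)
    targeted = ranked[:top_k]
    churners = [c for c in customers if c.will_churn]
    hits = [c for c in targeted if c.will_churn]

    precision = len(hits) / len(targeted)
    recall = len(hits) / len(churners)

    revenue_change = 0.0
    for c in targeted:
        if not c.will_churn:
            # Would have stayed anyway and now pays less.
            revenue_change -= discount
        elif c.persuadable:
            # Saved customer who keeps paying the discounted price.
            revenue_change += c.monthly_revenue - discount
        # Churners who are not persuadable leave regardless: no change.
    return precision, recall, revenue_change


# Hypothetical customers, ordered by score for readability.
customers = [
    Customer(0.9, True, False, 100.0),
    Customer(0.8, True, True, 100.0),
    Customer(0.6, False, False, 100.0),
    Customer(0.2, False, False, 100.0),
]

for k in (1, 2, 3):
    print(k, evaluate_discount_policy(customers, k, discount=20.0))
```

In this toy setting, targeting only the top-ranked customer gives perfect precision but no change in revenue, while widening the target group to three customers keeps full recall yet lowers the revenue gain compared with targeting two.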
The objective of this study is to highlight the importance of policy evaluations in the context of customer life cycles utilizing AI and machine learning (ML). Through an analysis of recent literature, we seek to demonstrate the need for evaluating the effectiveness of AI-powered policies and decision support models. By emphasizing the significance of policy evaluations, this study aims to promote the development of more reliable and trustworthy AI practices in customer life cycle management. Hence, this study aims to answer the following research questions:
RQ1: What conclusions can be drawn from the literature regarding how AI is applied throughout the customer life cycle?
This article argues that AI has the ability to empower customer life cycle management, but its successful implementation requires a careful balance between automation and human-in-the-loop, as well as a commitment to ethical AI practices.
The main contributions in this paper are the following:
• First, we propose the use of policy evaluation to assess the impact of AI on the customer life cycle.
• Second, from an industrial perspective, a research gap is identified and motivated as a direction for future research.
• Third, this paper provides a high-level overview of how AI is applied and how its impact is measured throughout the customer life cycle.

II. BACKGROUND
Customer relationship management (CRM) focuses on building and maintaining relationships with customers to maximize their lifetime value. It is a term that encompasses cross-functional properties such as multichannel integration, customer interaction, data collection and analytics [3]. Customer life cycle management (CLM) is a broader approach, concerned with managing the entire customer life cycle to deliver the overall best possible customer experience.

A. Customer life cycle
The definitions of what stages should be included in the customer life cycle have varied over time. However, there are a few stages that are consistently covered in the most recent research (Figure 1).
Reach is about brand awareness and equity: making sure all potential customers are aware of the products and services offered, and what solutions the company provides to solve their problems. Customer profiling, segmentation, and targeting fall under this category [4].
Acquisition is when the customer takes action to engage in communication and build a relationship. How this looks depends on the channel; e.g. visiting the company website and calling a sales representative are handled differently. Product recommendations, personalization, and lead ranking are applied in this stage [4].
Conversion happens when the prospect finalizes a purchase and becomes a value-added customer [4]. Dynamic pricing and content curation based on the buyer's journey are two examples of conversion improvements.
Retention, loyalty, or advocacy is where you want the customer to remain. Retaining a customer is often many times cheaper than acquiring a new one. Up- and cross-selling, customer support, and co-creation are common activities here. Satisfied and loyal customers become brand advocates, expanding the reach and completing the circle.
Churn is defined as the percentage of customers that decide to exit, i.e., customers who stop buying products or services from a company [5]. In this analysis, churn is considered a separate stage in the life cycle as it comes with its own actions, e.g. trusted advisor, personalized offers, exit management, and re-targeting previous customers.

B. Policy evaluation
Policy evaluation is the objective, systematic, and empirical review of the effects a policy has on its target. How it is performed depends on the policy and the result it tries to achieve.
Formative analysis assures that a policy is feasible and appropriate before it is implemented fully. This could be a conceptual framework from which conclusions about the outcome are inferred.
Process evaluation makes sure that the policy has been implemented correctly, that activities are carried out, and that they reach the targeted population as intended. Process evaluation can highlight implementation issues early but does not provide insights into the effects of the policy.
Cost-benefit analysis sums the expected future reward of an action and subtracts the expected cost. This is rather straightforward when the goal is financial and is commonly used in businesses. However, it may be misleading when the benefits are partially or entirely intangible.
Impact analysis is an evidence-based method that tries to answer what the effects of an action are compared to inaction. Causal relationships have to be defined and tested to answer what outcomes are directly attributable to the policy.
Qualitative evaluation is useful to determine effects on opinions, attitudes, motivations or experiences. Usually, it takes the form of surveys, questionnaires, focus groups or interviews.
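Two of the evaluation types above reduce to simple arithmetic, which the following minimal sketch illustrates. The function names and the example numbers are assumptions made for illustration. The difference in means is a valid estimate of impact only when the treated and control groups were assigned at random.

```python
from statistics import mean


def net_benefit(expected_rewards, expected_costs):
    """Cost-benefit analysis: expected future rewards minus expected costs."""
    return sum(expected_rewards) - sum(expected_costs)


def average_treatment_effect(treated_outcomes, control_outcomes):
    """Impact analysis for a randomized test.

    Mean outcome under action minus mean outcome under inaction.
    """
    return mean(treated_outcomes) - mean(control_outcomes)


# Hypothetical numbers: two reward items and two cost items.
print(net_benefit([120.0, 80.0], [50.0, 30.0]))

# Hypothetical retention outcomes (1 = retained, 0 = churned).
print(average_treatment_effect([1, 1, 0, 1], [1, 0, 0, 1]))
```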

III. METHODOLOGY
To answer the research questions, a review of the literature was conducted. Scopus was chosen as the bibliographic database due to its extensive coverage of peer-reviewed papers in data science. The search query was developed with the intention of achieving high recall and capturing all stages of the customer life cycle. It also emphasizes AI implementations and empirical findings. The query's first component contains keywords related to the customer life cycle; these were selected based on a literature review [4] as well as gray literature [6], [7]. The second component contains keywords related to AI and relevant subdomains. The query, when executed (January 27, 2023), returned 224 results.
TITLE-ABS-KEY(customer w/5 (lead OR prospect OR reach OR acquisition OR conversion OR retention OR churn OR "life cycle" OR relationship OR experience OR journey) AND ("artificial intelligence" OR "machine learning" OR "big data" OR "expert system" OR "deep learning" OR ai OR ml OR dl) AND empirical*)
Each result was screened based on title and abstract using the following criteria:
• Must cover one or more stages of the customer life cycle,
• The article must be peer-reviewed with available, non-retracted, full-text in English,
• The article must be published in a journal, workshop, or conference,
• Implements AI or subdomains, e.g. ML, in one or more stages in the customer life cycle,
• Presents empirical evidence, excluding surveys, questionnaires, and interviews¹,
• Excluding conceptual or theoretical papers, e.g. comparing or evaluating the performance of specific AI implementations, data collection or training.
Figure 2 shows how the screening and analysis process was conducted. The titles and abstracts were read and compared to the above-mentioned inclusion criteria. The full text was obtained if the abstract did not reliably exclude the paper. This step decreased the number of publications from 224 to 48. For the remaining articles, the complete text was reviewed and assessed again using the inclusion criteria and research questions. Following this stage, the final set contains 31 relevant articles. Table I shows the distribution of articles across journals and conferences, and Figure 3 shows the distribution per year. Table II shows the method applied per stage in the customer life cycle. Only five of the 31 articles included any type of policy evaluation. These include formative analysis based on simulations (P_t), and more practical methods such as randomized tests (P_p). This is a clear indication that future research must focus on theoretical and practical policy evaluations.
¹ Surveys are excluded since they focus on subjective experiences rather than objective outcomes, which makes them ineffective for answering the research questions. Surveys may also be subject to sampling errors, measurement bias, and high cost due to the manual handling [8].

A. Reach & Acquisition
The first two stages of the customer life cycle are reach and acquisition. The distinction between the two is described in Section II-A; in practice, however, they overlap. The literature usually looks at the two stages combined when implementing AI, for instance customer targeting and customer response rate [9], or early customer lifetime value forecasting [10].
Haupt and Lessmann used statistical models and ML to improve the targeting policy of an e-coupon campaign [9]. The authors analyze the targeting decision problem and argue that the treatment cost is not fixed in practice but is dependent on the customer response probability. The derived policy was tested by simulating the stochastic variables based on real-world data, and the result was an increase in the overall campaign profit.
Traditional marketing channels are typically comprised of broadcasts; these channels have the potential to reach a large number of customers. However, compared to unicast channels, broadcasts are expensive and not very efficient. The problem with unicast channels is to find and target the right customers. Customer segmentation can improve the effectiveness of targeted marketing. The assumption is that there exist groups of customers that have similar characteristics and are therefore more likely to purchase similar products. Several studies have shown varying success at implementing AI for this purpose [11]- [14]. In practice, this usually means that the segmentation clusters are manually labeled and paired with a marketing campaign based on perceived personas. However, little is known about the actual outcomes of customer segmentation-based targeting or other uses of customer segmentation.
Customer lifetime value (CLV) can drive targeting policies to focus resources on the most valuable clients and thereby also drive competitive advantage and profit growth [15]. CLV is typically based on demographics and historical purchase data. Forecasting CLV based on historical purchase data only works after enough data has been gathered about a specific customer's spending pattern. This inability to make inferences early is often called a "cold start" or "bootstrap" problem. Padilla and Ascarza [10] developed a modeling framework called First-Impression Model (FIM), which can identify high-value customers at the acquisition stage, thereby alleviating the cold start problem of CRM. The authors evaluate their model by running simulations based on real data and show promising empirical results. They discuss policies that could be implemented in practice, e.g. customer targeting based on the predictions of FIM. However, the evaluation of real-world policies was left to future research.
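Returning to the e-coupon targeting decision described earlier in this subsection, the idea that the treatment cost depends on the response probability can be sketched as follows. This is not the model of Haupt and Lessmann [9]; it is a simplified illustration in which the function names, the purchase probabilities, the margin, the coupon value, and the contact cost are all assumed.

```python
def expected_profit_if_targeted(p_buy_with_coupon, margin, coupon_value,
                                contact_cost):
    """Expected profit from a customer who receives the coupon.

    The coupon is redeemed only when the customer buys, so the expected
    treatment cost (p_buy_with_coupon * coupon_value) is not fixed.
    """
    return p_buy_with_coupon * (margin - coupon_value) - contact_cost


def expected_profit_if_not_targeted(p_buy_without_coupon, margin):
    """Expected profit from a customer who is left alone."""
    return p_buy_without_coupon * margin


def should_target(p_with, p_without, margin, coupon_value, contact_cost):
    """Target only when the coupon raises the expected profit."""
    targeted = expected_profit_if_targeted(
        p_with, margin, coupon_value, contact_cost)
    untreated = expected_profit_if_not_targeted(p_without, margin)
    return targeted > untreated


# Hypothetical customer who rarely buys without a coupon: worth targeting.
print(should_target(0.30, 0.10, margin=50.0, coupon_value=10.0,
                    contact_cost=0.5))

# Hypothetical customer who would probably buy anyway: the coupon mostly
# gives away margin, so targeting lowers the expected profit.
print(should_target(0.65, 0.60, margin=50.0, coupon_value=10.0,
                    contact_cost=0.5))
```

The second case shows why a high response probability alone is a poor targeting criterion: the more likely the purchase, the more likely the coupon cost is incurred.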

B. Conversion
Conversion is the stage where a prospect becomes a value-added customer. A strong application of AI within conversion is dynamic pricing. The desired output of such models is a prediction of the price at which the customer is most likely to convert. Pricing strategies are typically developed through supply and demand modeling, in which both supply and demand participate in moving the price one way or another [16]. In some studies the supply side is known or presumed fixed, which lets the AI model focus on predicting the demand [17], or vice versa. However, the forecast would not by itself affect the desired outcome, which is increased profit. When defining a pricing strategy based on forecasting demand, supply, or both, it is important to factor in the risks of setting the price too low or too high. The policy should be evaluated based on the desired outcome, e.g. increased profits.
Some studies examined user behavior from various touchpoints. If touchpoints can adapt to individual user behavior, conversion rates may increase. In practice, this usually entails generalizing across a population in order to infer individual behavior patterns [18]. Alfian et al. [19] propose a statistical model for extracting association rules from real-time online transactional data. The authors evaluated the model theoretically in a simulated environment, in which it successfully predicted customer behavior and increased sales of recommended products.
Recommendation systems were initially developed to improve the information retrieval process. From a business perspective this usually means assisting the customers in finding the most relevant products. These may manifest as search intent models [20] or dynamic content curation based on the customer's buying cycle. However, the recommendations could just as well have the opposite effect if the choice of recommendation engine does not fit the purpose or is badly implemented [21].
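The gap between the model output (the price at which conversion is most likely) and the desired outcome (profit) can be shown with a minimal sketch. The logistic demand curve, its parameters, the unit cost, and the candidate prices are hypothetical and stand in for whatever demand forecast a real system would provide.

```python
import math


def conversion_probability(price, midpoint=50.0, steepness=0.1):
    """Hypothetical demand model: conversion falls as the price rises."""
    return 1.0 / (1.0 + math.exp(steepness * (price - midpoint)))


def expected_profit(price, unit_cost):
    """Expected profit per prospect at a given price."""
    return conversion_probability(price) * (price - unit_cost)


def best_price(candidate_prices, unit_cost):
    """Pick the candidate price with the highest expected profit."""
    return max(candidate_prices, key=lambda p: expected_profit(p, unit_cost))


candidates = [30.0, 40.0, 50.0, 60.0, 70.0]

# Price with the highest conversion probability (the model output).
print(max(candidates, key=conversion_probability))

# Price with the highest expected profit (the desired outcome).
print(best_price(candidates, unit_cost=20.0))
```

Under these assumptions the lowest price maximizes conversion, while a higher price maximizes expected profit, which is why the policy rather than the forecast should be evaluated.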
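As background for readers unfamiliar with association rules, the two standard measures used to rank them, support and confidence, can be computed as in the sketch below. This is a generic illustration and not the model proposed by Alfian et al. [19]; the transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    matching = sum(1 for t in transactions if items <= set(t))
    return matching / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
transactions = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone"},
    {"charger"},
]

print(support(transactions, {"phone", "case"}))
print(confidence(transactions, {"phone"}, {"case"}))
```

A rule such as "phone implies case" with high confidence is a candidate recommendation, but, as argued above, whether acting on it increases sales is a question about the policy outcome.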

C. Retention
When a prospect converts to a value-added customer, they enter the retention stage. Many activities go into customer retention, and success in retaining customers can be measured by the churn rate. As stated in Section II-A, churn is treated as a separate stage from retention activities in this analysis.
Churn prevention is executed when there are clear indications of churn, whereas retention activities focus on building loyalty and a healthy relationship with the customer. Hedonic aspects, such as customer experience and satisfaction, are important for a healthy customer relationship. Customer loyalty, however, requires a business to consistently deliver a positive experience [22]. Customer service and support play a critical role in ensuring that problems voiced by customers are resolved in a timely and satisfactory manner. Customer loyalty may even increase after reporting a problem, if managed properly [23].

1) Experience & Satisfaction:
Experience and satisfaction are difficult to quantify because they are subjective, and customers may find it difficult to express their experience for the same reason. Customer experience (CX) measurements are, however, essential to businesses to ensure that customer journeys align with the company's vision and enable long-term trust and co-creation. Understanding the customer's emotions is key to understanding their behaviour [24].
Sidaoui, Jaakkola and Burton [25] propose a chatbot framework that is based around "narrative inquiry via data collection mechanisms such as storytelling and interviews" to improve primary experience gathering. Deng and Murari [26] suggest that Crowdsourced Voice Feedback is a more reliable, customer-centric, less expensive and instantaneous method of measuring CX when using Intelligent Voice Assistants (IVA). Using ML-based causal inference methods, they discovered that prompting customers for feedback had no negative impact on CX; however, the response rate decreased as the elicitation frequency increased. The authors also investigated the causal relationship between the response rate and the types of questions asked and when they were asked. Lee, Tse, Zhang and Ma [27] extracted insights from customer reviews of short-term homestays and experiences using statistical methods. In theory, the method they present can provide hosts with valuable customer-centric and immediate insights, improving product quality and customer experience. However, the authors do not address potential risks or consequences in an online real-world setting.
Yu et al. [28] developed a decision system based on ML to optimize the scheduling of food deliveries. The system delays an order when the probability of grouping it with orders for similar destinations is high, thus increasing the delivery efficiency. It is at the same time constrained by the probability of overtime penalties and the average delivery time. The resulting policy was tested using an online A/B test in several cities. The decision system was compared to a fixed 2-minute delay policy and showed an improved grouping success of 41.20% compared to 22.19%. The article did not discuss explainability aspects or potential risks with variance or bias towards some types of orders compared to others.
2) Service & Support: Andrade and Moazeni describe a statistical model that predicts interactive voice response (IVR) transfer rates with an area under the curve between 77% and 95% depending on the caller's location [29]. They present theoretical impacts, e.g. the model could improve customer satisfaction by bypassing the self-service IVR solution, but do not evaluate any practical policy based on this assumption. It could be argued that a simpler policy for achieving this goal is to make the self-service IVR optional, considering that the authors also stated that "dealing with automated customer service platforms before reaching to an agent is considered to be the most frustrating part of a poor contact experience".
Being able to forecast obsolescence enables a business to better allocate resources, e.g. for stocking up on parts and repairs. The idea is that being proactive in this area allows the business to cut costs by improving organizational efficiency and at the same time provide a better customer experience. AI can be employed to predict the obsolescence date of products based on individual wear and tear [30]. Little is known about the magnitude of the impact of using obsolescence forecasts to prioritize preventative actions and avoid costs.
Failures will occur no matter how diligently a company tries to avoid them. There is a constant balance between risk mitigation and cost containment. There are factors that influence this balance in both directions, and it all boils down to how much the customer is willing to pay. Kim et al. [31] analyze these factors in detail and how they impact willingness to pay (WTP). The authors demonstrate that ML can be used to approximate the customer damage function (CDF) and discuss how it can be used in practice to improve the product offerings. Future research will show how such policies manifest in the real world.

D. Churn
Customer churn is defined as the percentage of customers that decide to exit, i.e., customers who stop buying products or services from a company. All businesses eventually reach a point where market saturation makes acquiring new customers more expensive than retaining existing customers [5], [32]. Startups and businesses operating in new markets place less emphasis on churn rate while growth is still cheap. Churn metrics also indicate the overall satisfaction and willingness to do repeated business. Churn rate is perhaps most notable in companies that offer services or subscriptions, e.g. telecommunications, cloud or streaming media companies. In the literature, two main forms of churn modelling were presented: prediction and prevention models. However, most research focuses on the churn prediction problem, and little is known about the effects of churn prevention policies.
1) Prediction: Forecasting churn can be very effective, but the data is difficult to collect and the available data is naturally very imbalanced, as the churn rate preferably should be as low as possible [32], [33]. It is also difficult to capture all the relevant features, as both internal and external events can affect the reason to churn [34], [35]. There appears to be no widely used benchmark dataset for this purpose, making it difficult to compare results across the literature. Nonetheless, the churn prediction problem has received considerable attention [36]- [40].
2) Prevention: Based on churn predictions, policies can be formed to reduce the churn rate, i.e. churn prevention. This step is often overlooked by the available literature. It is not enough to present a model with high evaluation scores for it to be useful in practice. The model must support a policy that can identify the right customers and the right treatment to lower the churn rate. Customers respond differently to different treatments; therefore, the individual treatment effect (ITE) is an essential driver in the targeting decision to ensure an optimal outcome. Using the naive approach and targeting high-risk churners while disregarding ITE may even have the opposite effect [1]. The possibility to alter the churn prediction optimizer for either precision or recall enables managers to tailor the model to better suit their specific targeting goals [5]. There are trade-offs when using models based on imperfect information² that need to be acknowledged during the targeting decision.
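The practical consequence of class imbalance can be illustrated with a short sketch: on a hypothetical population of 100 customers with 5 churners, a model that never predicts churn scores higher on accuracy than one that finds most churners. The data and the convention of returning 0.0 for an undefined precision or recall are assumptions made for the example.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Convention for this sketch: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical population: 5 churners followed by 95 non-churners.
y_true = [1] * 5 + [0] * 95

# Baseline that predicts "no churn" for everyone.
never_churn = [0] * 100

# Model that finds 4 of the 5 churners at the price of 6 false alarms.
model = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

print(classification_metrics(y_true, never_churn))
print(classification_metrics(y_true, model))
```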
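The difference between targeting by churn risk and targeting by ITE can be sketched as follows. The churn probabilities with and without treatment are assumed to be known here, which is never the case in practice: they would have to be estimated, for instance from randomized experiments. The customers, values, and treatment cost are hypothetical.

```python
def individual_treatment_effect(customer):
    """Reduction in churn probability caused by the treatment."""
    return customer["p_churn"] - customer["p_churn_treated"]


def expected_gain(targeted, treatment_cost):
    """Expected retained value minus treatment cost for a target group."""
    return sum(
        individual_treatment_effect(c) * c["value"] - treatment_cost
        for c in targeted
    )


def top_k(customers, k, key):
    """The k customers with the highest value of key."""
    return sorted(customers, key=key, reverse=True)[:k]


# Hypothetical customers with assumed treated and untreated churn risk.
customers = [
    {"id": "a", "p_churn": 0.9, "p_churn_treated": 0.9, "value": 100.0},
    {"id": "b", "p_churn": 0.6, "p_churn_treated": 0.2, "value": 100.0},
    {"id": "c", "p_churn": 0.3, "p_churn_treated": 0.1, "value": 100.0},
    # The offer prompts this customer to reconsider the subscription.
    {"id": "d", "p_churn": 0.1, "p_churn_treated": 0.3, "value": 100.0},
]

by_risk = top_k(customers, 2, key=lambda c: c["p_churn"])
by_ite = top_k(customers, 2, key=individual_treatment_effect)

print([c["id"] for c in by_risk], expected_gain(by_risk, treatment_cost=5.0))
print([c["id"] for c in by_ite], expected_gain(by_ite, treatment_cost=5.0))
```

In this example the risk-based policy spends a treatment on a customer who leaves regardless, while the ITE-based policy reaches the two customers whose behavior the treatment changes most. Treating customer "d" alone would yield a negative expected gain, the opposite of the intended effect.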

V. DISCUSSION
Customer life cycles are broad and touch on so many different domains that it is difficult to capture all of them in a single review. Even though the database query was designed for recall, querying each stage separately might return more results, although this would presumably also greatly increase the effort required for the review. The authors acknowledge that the query is limited to articles using the term "empirical*", which may not fully represent the entire set of empirical work. This paper is also limited to business applications of AI.
Ranking models have become a very popular tool for improving information retrieval and recommendations. In a digital age where data is abundant, the ability to quickly and effectively retrieve relevant data is demonstrated at several stages in the customer life cycle. It provides customer targeting and segmentation for improving reach and acquisition. Dynamically curating web pages and recommending relevant products provide increased conversion. Ranking may improve retention and loyalty by aiding customer service in retrieving relevant information to solve problems quicker. However, it may also target consumers who are not qualified as long-term loyal customers or offer solutions that do not genuinely fix their problem, which has a negative impact on customer relations. A system that automatically gives out discounts based on a customer's churn probability score may encourage customers to churn more frequently in the long run. These examples demonstrate the effects a flawed policy could have in a real-world setting.
Customer acquisition cost, customer satisfaction, and lifetime value are rarely discussed alongside propensity to buy, customer segmentation, and targeting models. This may not be a problem if the company's only goal is to increase sales or market share. But in reality, it is much more likely that customers are valued differently for strategic and long-term growth purposes. It may be reasonable to take a chance on a high-value customer if the treatment cost is not prohibitively high in relation to the conversion probability, even if that probability is lower than that of the alternative low-value customers. There may also be ethical considerations that influence whether or not to associate with certain customers over others, such as when the customer's reputation conflicts with the company's values. The desired outcome must be the primary consideration when deciding on the metrics of success to use for AI implementations. These trade-offs are not discussed to any great extent in the literature.
Churn prevention as a policy based on churn prediction appears to be mostly overlooked by the literature. Consider the combination of a churn prediction model, which feeds a treatment response model to determine the most effective treatment, which in turn feeds a treatment cost prediction model. This problem is similar to the e-coupon targeting presented by Haupt and Lessmann as described in section IV-A.
There are several ways to evaluate policies, as described in Section II-B. For churn prevention, a formative analysis could be done using historical data to identify patterns and trends in customer behavior that may be contributing to churn. Cost-benefit analysis may prove to be complex, as the cost of the treatment can vary with the accuracy of the prediction model: inaccurate predictions may lead to incorrect treatments, resulting in higher costs. Accurately assessing the costs and benefits of an intervention requires consideration of both the treatment cost and the response rate, which both depend on the accuracy of the model. The gold standard for impact analysis is A/B testing, which can identify which policies are most effective at achieving their desired outcomes while also mitigating potential harms or unintended consequences. A/B testing does, however, require interventions that entail risks which may not be acceptable in practice; for those cases one could look at e.g. doubly robust learners, which work on observational data. Finally, a qualitative assessment could include methods such as focus groups, interviews, surveys, and net promoter score (NPS) to gather feedback on the experience of an AI-powered system and how it affected the decision-making process.
The choice of optimization is imperative to achieve desired outcomes. When employing AI to automatically elicit customer feedback [26] or to mine insights from product reviews [41], the policy output is when to trigger elicitation and the similarity score of product reviews, respectively. However, the desired policy outcome is improved customer agility and experience. The effectiveness of this outcome in practice depends on what the AI is optimized for. In order to obtain accurate insights, the data must be representative of the population. Regarding feedback elicitation, optimizing for response rate may have a negative impact on the total number of responses, as only the customers that are guaranteed to respond will be elicited. On the other hand, optimizing for the number of responses may elicit feedback too often and reduce both the response rate and customer experience. Zhou et al. demonstrated that there are both positive and negative impacts on product performance depending on how the reviews are used. Some products do benefit from high agility, while others may suffer from rapidly changing behavior. Only optimizing for the agility metric may not result in deliveries that actually improve the customer experience and product performance.
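To indicate what a doubly robust approach to observational data might look like, the sketch below implements the augmented inverse-propensity-weighted (AIPW) estimator, one common doubly robust estimator, on a single discrete covariate. The choice of AIPW, the use of per-stratum means as both propensity and outcome models, and the records are assumptions made for illustration. Every stratum must contain both treated and untreated customers.

```python
from collections import defaultdict
from statistics import mean


def naive_difference(records):
    """Difference in mean outcome, ignoring who was selected for treatment."""
    treated = [y for _, t, y in records if t == 1]
    control = [y for _, t, y in records if t == 0]
    return mean(treated) - mean(control)


def doubly_robust_ate(records):
    """AIPW estimate of the average treatment effect.

    records: (stratum, treated, outcome) tuples with treated in {0, 1}.
    Propensity and outcome models are simple per-stratum means.
    """
    by_stratum = defaultdict(list)
    for stratum, treated, outcome in records:
        by_stratum[stratum].append((treated, outcome))

    propensity, mu1, mu0 = {}, {}, {}
    for s, rows in by_stratum.items():
        propensity[s] = mean([t for t, _ in rows])
        mu1[s] = mean([y for t, y in rows if t == 1])
        mu0[s] = mean([y for t, y in rows if t == 0])

    scores = []
    for s, t, y in records:
        e = propensity[s]
        score = mu1[s] - mu0[s]                       # outcome-model term
        score += t * (y - mu1[s]) / e                 # correction, treated
        score -= (1 - t) * (y - mu0[s]) / (1 - e)     # correction, control
        scores.append(score)
    return mean(scores)


# Hypothetical observational data (outcome 1 = retained). High-risk
# customers were treated more often, which confounds a naive comparison.
records = (
    [("high", 1, y) for y in (1, 1, 1, 0)]
    + [("high", 0, y) for y in (1, 0)]
    + [("low", 1, y) for y in (1, 1)]
    + [("low", 0, y) for y in (1, 1, 1, 0)]
)

print(naive_difference(records))
print(doubly_robust_ate(records))
```

In this constructed example the treatment raises retention by 0.25 in both strata, which the adjusted estimate recovers, whereas the naive comparison understates the effect because treated customers were at higher risk to begin with.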

VI. CONCLUSIONS AND FUTURE WORK
In summary, AI implementations have been mapped to each of the customer life cycle stages and thoroughly analyzed. Theoretical and practical implications of AI were discussed for each stage. AI model performance evaluations were frequent; however, model performance does not necessarily translate to the expected real-world outcomes. The evidence for successful and acceptable policy outcomes was almost non-existent. Successful customer life cycle management takes a broad view and responsibility for the entire customer experience. The effects of AI on the overall experience were not presented in the literature. This concludes the answer to RQ1.
Only five papers that studied the outcomes of AI policies were found in this analysis. Three of these were formative [1], [5], [9] and the other two performed an online impact analysis [28] and a cost-benefit analysis [26]. For industries looking at employing AI in their business, policy evaluations support the business case and trust. This review highlights a research gap when it comes to evaluating the outcomes of AI in customer life cycle management. Policy evaluations are helpful in assuring stakeholders that the AI not only performs well on paper, but also works as intended, with acceptable outcomes and with no adverse biases or ethical concerns. Human-centered evaluative research is critical to business ethics and society as AI becomes more prevalent in real-world settings. This concludes the answer to RQ2.
Future work can branch into several directions. First, there is a need to establish an unambiguous and easy-to-follow framework for how a business should perform policy evaluations on AI implementations. What are the steps, necessary tools, metrics, terminology, and processes involved? How can policy evaluations be part of the AI pipeline from start to production?
Second, guidelines for trustworthy and ethical AI in businesses should be included in the policy evaluation process. Transparency is key to growing trust, but further research is necessary to understand how transparency is best communicated. In what ways should a business expose that a decision was made by AI? Is it possible to challenge the decisions? In what ways can a customer voice concerns or problems they experience from AI decisions?
Third, a set of standard datasets for evaluation is required to further enrich the literature on AI in customer life cycles. To our knowledge, no widespread standard evaluation frameworks exist for any of the steps in the customer life cycle, such as customer segmentation, dynamic pricing, or churn prevention.