Big data systems are widely regarded as the future of business. Banking organizations, which handle large volumes of data, have already adopted such systems, as have many e-commerce sites. These systems work on the principle of training data (Ekong & Vihinen, 2019). However, they can create implicit bias: Word2vec and GloVe are already notorious for affecting people's personal interests, and many users have labelled them racist and sexist. This report therefore analyzes the validity of those claims, examines the obligations such a system imposes on the banking industry, and, based on these obligations, proposes precautions and solutions for using a big data system responsibly.
Part a) Application of word embedding in Business:
Word embedding is the representation of a word as a vector to ease data mining and data sorting. It is a part of natural language processing (NLP) in machine learning. Two embedding techniques widely applied in business are listed below.
Word2vec
GloVe
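To make the vector representation concrete, the sketch below uses toy three-dimensional vectors with made-up values (real Word2vec or GloVe vectors have hundreds of dimensions learned from large corpora) and compares words with cosine similarity, the standard measure of how close two embeddings are:

```python
import math

# Toy word vectors with hypothetical values for illustration only.
embeddings = {
    "bank":    [0.9, 0.1, 0.3],
    "finance": [0.8, 0.2, 0.4],
    "apple":   [0.1, 0.9, 0.2],
}

def cosine_similarity(u, v):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_related = cosine_similarity(embeddings["bank"], embeddings["finance"])
sim_unrelated = cosine_similarity(embeddings["bank"], embeddings["apple"])
print(sim_related > sim_unrelated)  # related words score higher
```

Because similar words end up with similar vectors, a business system can sort, cluster, and search text by meaning rather than by exact keyword matches.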
Part b) Implicit bias in the applications:
These positive qualities make word-embedding applications an attractive choice in business. However, they have detrimental effects because their underlying data-learning process introduces implicit issues. They are:
Part c) Alleviate the implicit bias in a word embedding
As mentioned above, gender bias has been a real challenge for the word-embedding technique. However, it can be addressed by two methods. The origin of the bias lies in the fundamental concepts of the data-learning process: according to Rigg (2018), the conventional model relies on the training-data approach, so while the model is in operation it collects keywords and associates them with the gender of the users who search for them. For example, if 600 people search for "fighter jet pilot" and 500 of them are male, the training-data approach presumes a fighter jet pilot to be a male candidate. To diminish this implicit bias, a data manager should adopt the data augmentation technique together with the gender tagging technique.
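The probability-driven association described above can be sketched as follows; the search log is a hypothetical reconstruction of the 600-query example, not real data:

```python
from collections import Counter

# Hypothetical search log: 600 queries for "fighter jet pilot",
# 500 issued by users recorded as male and 100 as female.
search_log = ["male"] * 500 + ["female"] * 100

counts = Counter(search_log)
total = sum(counts.values())

# The naive training-data approach associates the occupation with
# whichever gender dominates the co-occurrence statistics.
p_male = counts["male"] / total                 # 500 / 600
inferred_gender = max(counts, key=counts.get)

print(inferred_gender)  # the model now "presumes" a pilot is male
```

Nothing in the data says a pilot must be male; the presumption is purely an artifact of who happened to search, which is exactly the implicit bias the two techniques below try to remove.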
Data augmentation technique: Each data item should remain editable, and no gender presumption should be made before the gender is actually identified. The technique has two subcategories: position augmentation and colour augmentation. Under this method, the rotation, cropping, clarity and contrast of a data item are maintained manually, and editing of the content itself should also be permitted (Collmann & Matei, 2016), so that no other entity can tag a word with a gender. This technique not only diminishes the bias but also gives the data manager extra advantages; for example, colour augmentation offers the data editor numerous ways to manage a data item's appearance while diminishing the gender-bias issue.
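For the text corpora that feed a word embedding, the same augmentation idea can be illustrated with a gender-swap pass, a common text-augmentation analogue of the image edits described above; the word pairs are illustrative, not a complete list:

```python
# Sketch of gender-swap data augmentation (hypothetical swap table).
# Each training sentence is duplicated with gendered terms exchanged,
# so the model sees "he"/"she" contexts in equal proportion.
SWAP = {"he": "she", "she": "he", "his": "her", "her": "his",
        "man": "woman", "woman": "man"}

def gender_swap(sentence):
    """Return the sentence with every gendered token exchanged."""
    return " ".join(SWAP.get(tok, tok) for tok in sentence.lower().split())

def augment(corpus):
    # Keep each original sentence and add its swapped counterpart.
    return corpus + [gender_swap(s) for s in corpus]

corpus = ["he is a pilot", "she is a nurse"]
print(augment(corpus))
# ['he is a pilot', 'she is a nurse', 'she is a pilot', 'he is a nurse']
```

After augmentation, both genders co-occur equally often with "pilot" and "nurse", so the embedding has no statistical reason to prefer one.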
Gender tagging: Data augmentation is sufficient to manage implicit gender bias in a word embedding, but extra protection is always preferable, so manual gender tagging is proposed. It can also be an economical solution for companies with little capital: data augmentation requires an ample budget and a continuous cloud-management process, which smaller businesses cannot fund regularly (Scassa & Taylor, 2017). Under gender tagging, search content is managed manually. For example, the system does not tag "fighter jet pilot" as male by default; instead it checks the image, and only if the image shows a male pilot is the male gender tagged. Similarly, masculine names are tagged with the male gender and feminine names with the female gender, while unisex names are left untagged. Unlike the training-data policy, where a probability determines the characteristics of a data item, here complete assurance determines them.
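The tagging rule above can be sketched as a simple lookup; the name lists are hypothetical stand-ins for a manually curated registry:

```python
# Minimal manual gender-tagging sketch (hypothetical name lists).
MALE_NAMES = {"james", "fred"}
FEMALE_NAMES = {"tamara", "mary"}

def tag_gender(name):
    """Tag a name only when its gender is certain; unisex names stay untagged."""
    n = name.lower()
    if n in MALE_NAMES:
        return "male"
    if n in FEMALE_NAMES:
        return "female"
    return None  # unisex or unknown: no tag, no presumption

print(tag_gender("Tamara"))  # female
print(tag_gender("Alex"))    # None
```

The key design choice is the `None` branch: where the training-data approach would guess from a probability, manual tagging simply declines to assign a gender.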
Part a) Benefits of using Big data system in financial organizations:
A human operator cannot verify the validity of data for millions of applicants, each of whose applications is somehow related to hundreds of linked records. Financial organizations are therefore adopting big data systems to identify the risk associated with a loan application: the system delivers a maximally validated result in a minimum amount of time (Nair, 2020), and it also checks for any third-party data links associated with the application. In the given case study, for example, Fred and Tamara were linked to a third-party source of data. The system analyzed each of these records and found no reliable link. The source may be reliable, as the couple claims, but there is no assurance that the flow of data will remain constant in future; if it is reduced or terminated, their business may be affected. The enhanced big data system easily identified this issue and marked the application as a medium-to-high-risk business.
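A screening pass of this kind might be sketched as below; the field names and thresholds are illustrative assumptions, not a description of any real bank's system:

```python
# Hedged sketch of an automated third-party-data risk screen.
def assess_application(app):
    """Score a loan application on its third-party data dependencies."""
    risk = 0
    if app.get("relies_on_third_party_data"):
        risk += 2  # revenue depends on an external data feed
    if not app.get("third_party_link_verified"):
        risk += 1  # the feed's reliability could not be confirmed
    return "medium-high" if risk >= 3 else "low"

application = {
    "applicant": "Fred & Tamara",
    "relies_on_third_party_data": True,
    "third_party_link_verified": False,
}
print(assess_application(application))  # medium-high
```

Note that the rule never asks whether the applicants themselves are creditworthy; it reacts only to the unverified data link, which is precisely the behaviour discussed in the case study.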
Part b) Ethical harms to the life interests of Fred and Tamara
Everyone has desires in life, and Maslow describes them in perhaps the most scientific form. The basic needs are food to live, air to breathe, shelter to reside in, and the physiological needs that satisfy basic instincts. Beyond these, a person seeks respect, social position, a dream career, reputation, knowledge, liberty, social security, economic security, entrepreneurship and more (Cooper, 2016). A chief data manager who analyzes loan application data often considers the applicant's personal interests and their bond with the application, which helps identify these ethical life interests. A software-driven data analysis, however, considers no ethical interest of a person, which is what Tamara and Fred experienced.
Liberty in life: Tamara and Fred desired a life in which they would be "their own boss", as the case study puts it. However, their loan application was rejected by a soulless algorithmic system, so their dream of becoming entrepreneurs no longer looks possible. The advanced data-learning system behind the big data analysis has recorded the previous result; if they apply again, the application will be rejected again because of that record.
Respect: Tamara and Fred believed that a business can earn respect in society (Beretta et al., 2018). However, the rejected application puts a barrier in front of that respectful life. Moreover, the stated reason for rejection is an improper source of data, which may harm their respect by implying they were seeking a loan for an illicit business.
Economic security: Tamara and Fred may yet find another way to meet the financial requirements of the business, but a bank loan is always a form of assurance in business. By rejecting the application, the algorithmic system may have destroyed that security for both the business and their life.
Social security: Being an entrepreneur is arguably a better position in society than being employed, so this social security was part of Fred and Tamara's life interests. The rejected loan application has harmed them in this area as well.
Reputation: Their existing consultant had already told them that their loan application draft was perfect, so they had probably informed friends, family members and others close to them. These people are an important part of one's life interests (Beretta et al., 2018); having to tell them about the rejection may damage the couple's reputation as well.
Part c) Harm in society due to this system:
Big data analysis depends on the training-data concept. Because the loan has been marked as a risk loan, every entity associated with it is now considered a medium-to-high-risk member (Scassa & Taylor, 2017). For example, suppose Fred and Tamara are stakeholders in another person's business and that person now applies for a loan: when the system detects Fred and Tamara in the data, it will sound a risk alarm. The rejection can therefore cause immense harm to every individual linked with these two people.
In addition, the implicit bias inherent in the training-data concept may mark any third-party data as risk data by default. If the system detects two or more such cases, it will stop analyzing third-party data altogether and simply presume it to be risk data.
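The propagation described above can be sketched as guilt-by-association scoring; the names and labels are illustrative, echoing the case study rather than any real dataset:

```python
# Sketch of how a rejected application's risk label can propagate to every
# linked entity under a naive training-data approach.
risk_labels = {"Fred": "high", "Tamara": "high"}  # from the rejected loan

def score_by_association(applicant, stakeholders):
    """Flag an applicant if ANY stakeholder already carries a high-risk label."""
    flagged = [s for s in stakeholders if risk_labels.get(s) == "high"]
    return "flagged" if flagged else "clear"

# A third person whose only link to the rejected loan is a stakeholder role:
print(score_by_association("Priya", ["Fred", "Omar"]))  # flagged
print(score_by_association("Omar", ["Lena"]))           # clear
```

The applicant's own merits never enter the function: one stale label is enough to taint every future application it touches, which is the societal harm at issue.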
Part d) Preventing the harms caused by the big data system
For the issues identified in sections 2b and 2c, the researcher recommends three solutions for a financial organization to adopt.
In addition, the model should recover the names of the applicants and their stakeholders; the default values attached to these names should be reset so that no such presumption is made in future.
In this assignment, the researcher has examined the implicit issues associated with machine learning systems. These systems can pose a great threat when analyzing ethically sensitive data: because a system is designed to abide by the rules of its algorithm, it may never consider a person's social, economic and personal circumstances. Human accountability, exercised by a trained professional, is therefore highly recommended in this case.
Beretta, E., Vetrò, A., Lepri, B., & De Martin, J. C. (2018, September). Ethical and Socially-Aware Data Labels. In Annual International Symposium on Information Management and Big Data (pp. 320-327). Springer, Cham.
Boté, J. J., & Térmens, M. (2019). Reusing Data: Technical and Ethical Challenges. DESIDOC Journal of Library & Information Technology, 39(6).
Bourhis, P., Demartini, G., Elbassuoni, S., Hoareau, E., & Rao, H. R. (2019). Ethical Challenges in the Future of Work. Data Engineering, 55.
Clark, K., Duckham, M., Guillemin, M., Hunter, A., McVernon, J., O’Keefe, C., ... & Waycott, J. (2019). Advancing the ethical use of digital data in human research: challenges and strategies to promote ethical practice. Ethics and Information Technology, 21(1), 59-73.
Collmann, J., & Matei, S. A. (Eds.). (2016). Ethical Reasoning in Big Data: An Exploratory Analysis. Springer.
Cooper, H. (2016). Ethical choices in research: Managing data, writing reports, and publishing results in the social sciences. American Psychological Association.
Corrall, S., & Currier, J. D. (2017). Ethical Issues of Big Data 2.0 Collaborations: Roles and Preparation of Information Specialists.
Ekong, R., & Vihinen, M. (2019). Checklist for gene/disease‐specific variation database curators to enable ethical data management. Human mutation, 40(10), 1634-1640.
Firmani, D., Tanca, L., & Torlone, R. (2019). Ethical Dimensions for Data Quality. Journal of Data and Information Quality (JDIQ), 12(1), 1-5.
Chalcraft, J. (2018). Drawing ethical boundaries for data analytics. Information Management, 52(1), 18-25.
Kwan, K., Schneider, J., & Ullman, J. S. (2019). Decompressive craniectomy: Long term outcome and ethical considerations. Frontiers in neurology, 10, 876.
Nair, H. L. K., Ten Wong, D. H., Fouladynezhad, N., Yusof, N. N., Chong, M. C., & Maarop, N. (2018). Big Data: Ethical, Social and Political Issues in Telecommunication Industry. Open International Journal of Informatics (OIJI), 18-25.
Nair, S. R. (2020). A review of ethical concerns in big data management. International Journal of Big Data Management, 1(1), 8-25.
O'Keefe, K., & Brien, D. O. (2018). Ethical data and information management: concepts, tools and methods. Kogan Page Publishers.
Olarewaju, O. M. (2018). Ethical Data Management and Research: Managing Ethical Issues for Research Integrity in Education. In Ensuring Research Integrity and the Ethical Management of Data (pp. 209-218). IGI Global.
Rigg, T. (2018). The ethical considerations for storing client information online. Professional Psychology: Research and Practice, 49(5-6), 332.
Scassa, T., & Taylor, F. (2017). Legal and ethical issues around incorporating Traditional Knowledge in polar data infrastructures. Data Science Journal, 16.