How to show a plotly chart in kedro?

  • Thread starter: Ryo Matsuzaka (Guest)

I am trying to use the data science tool kedro by following this tutorial (linked under Reference below).
I followed the instructions (writing config.yaml, nodes.py, pipeline.py, etc., exactly as in the documentation) and kedro run completed successfully.
As the next step, I ran kedro viz. The pipelines are displayed, but I cannot see the plotly chart.
Here is the result of the visualization; please see the left pane. I can see Shuttle Passenger Capacity Plot, but it is not activated and the plot does not show up.
[Image: https://i.sstatic.net/rTeKe.png]

Also, I configured conf/base/catalog.yaml to output a JSON file for plotly to load, but I cannot see any file in the 08_reporting directory. Could this be the cause of the issue?

[Image: https://i.sstatic.net/CAIhu.png]
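
One quick way to confirm whether the plot node wrote anything is to list the JSON files under the reporting folder after kedro run. The sketch below uses only the standard library and assumes the default Kedro project layout, where 08_reporting sits under data/. The helper name find_reporting_json is hypothetical and not part of Kedro.

```python
from pathlib import Path
from typing import List


def find_reporting_json(
    project_root: str, reporting_dir: str = "data/08_reporting"
) -> List[str]:
    """List JSON files under the reporting directory, relative to that directory.

    Assumption: the project uses the default layout with `data/08_reporting`.
    Returns an empty list if the directory is missing or holds no JSON files.
    """
    folder = Path(project_root) / reporting_dir
    if not folder.is_dir():
        return []
    # rglob also finds files nested in subfolders; is_file() skips any
    # directories whose names happen to end in ".json".
    return sorted(
        p.relative_to(folder).as_posix()
        for p in folder.rglob("*.json")
        if p.is_file()
    )


if __name__ == "__main__":
    # Run from the project root after `kedro run`.
    print(find_reporting_json("."))
```

An empty list here would suggest that the dataset was never saved to disk, for example because the node output did not match any catalog entry and was kept in memory instead.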

Update

nodes.py and pipeline.py are located here:
[Image: https://i.sstatic.net/dDzcK.png]

nodes.py

Code:
import pandas as pd


def _is_true(x: pd.Series) -> pd.Series:
    return x == "t"


def _parse_percentage(x: pd.Series) -> pd.Series:
    x = x.str.replace("%", "")
    x = x.astype(float) / 100
    return x


def _parse_money(x: pd.Series) -> pd.Series:
    x = x.str.replace("$", "").str.replace(",", "")
    x = x.astype(float)
    return x


def preprocess_companies(companies: pd.DataFrame) -> pd.DataFrame:
    """Preprocesses the data for companies.

    Args:
        companies: Raw data.
    Returns:
        Preprocessed data, with `company_rating` converted to a float and
        `iata_approved` converted to boolean.
    """
    companies["iata_approved"] = _is_true(companies["iata_approved"])
    companies["company_rating"] = _parse_percentage(companies["company_rating"])
    return companies


def preprocess_shuttles(shuttles: pd.DataFrame) -> pd.DataFrame:
    """Preprocesses the data for shuttles.

    Args:
        shuttles: Raw data.
    Returns:
        Preprocessed data, with `price` converted to a float and `d_check_complete`,
        `moon_clearance_complete` converted to boolean.
    """
    shuttles["d_check_complete"] = _is_true(shuttles["d_check_complete"])
    shuttles["moon_clearance_complete"] = _is_true(shuttles["moon_clearance_complete"])
    shuttles["price"] = _parse_money(shuttles["price"])
    return shuttles


def create_model_input_table(
        shuttles: pd.DataFrame, companies: pd.DataFrame, reviews: pd.DataFrame
) -> pd.DataFrame:
    """Combines all data to create a model input table.

    Args:
        shuttles: Preprocessed data for shuttles.
        companies: Preprocessed data for companies.
        reviews: Raw data for reviews.
    Returns:
        Model input table.

    """
    rated_shuttles = shuttles.merge(reviews, left_on="id", right_on="shuttle_id")
    model_input_table = rated_shuttles.merge(
        companies, left_on="company_id", right_on="id"
    )
    model_input_table = model_input_table.dropna()
    return model_input_table


import plotly.express as px


# The function below uses plotly.express to build the bar chart.
def compare_passenger_capacity(preprocessed_shuttles: pd.DataFrame):
    """Returns a bar chart of the mean passenger capacity per shuttle type."""
    mean_by_type = preprocessed_shuttles.groupby(["shuttle_type"]).mean().reset_index()
    fig = px.bar(data_frame=mean_by_type, x="shuttle_type", y="passenger_capacity")
    return fig

pipeline.py

Code:
from kedro.pipeline import Pipeline, node
from kedro.pipeline.modular_pipeline import pipeline

from .nodes import create_model_input_table, preprocess_companies, preprocess_shuttles, compare_passenger_capacity


def create_pipeline(**kwargs) -> Pipeline:
    return pipeline(
        [
            node(
                func=preprocess_companies,
                inputs="companies",
                outputs="preprocessed_companies",
                name="preprocess_companies_node",
            ),
            node(
                func=preprocess_shuttles,
                inputs="shuttles",
                outputs="preprocessed_shuttles",
                name="preprocess_shuttles_node",
            ),
            node(
                func=create_model_input_table,
                inputs=["preprocessed_shuttles", "preprocessed_companies", "reviews"],
                outputs="model_input_table",
                name="create_model_input_table_node",
            ),
            node(
                func=compare_passenger_capacity,
                inputs="preprocessed_shuttles",
                outputs="shuttle_passenger_capacity_plot",
            ),
        ],
        namespace="data_processing",
        inputs=["companies", "shuttles", "reviews"],
        outputs="model_input_table",
    )

Reference: https://kedro.readthedocs.io/en/stable/tutorial/visualise_pipeline.html
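
One detail of pipeline.py that may be relevant: when the pipeline() wrapper is given a namespace, Kedro prefixes every dataset name that is not listed in the wrapper's inputs or outputs. The sketch below is a simplified model of that naming rule, not Kedro's implementation. It ignores parameter references, and resolve_dataset_name is a hypothetical helper. Under this model the plot output would be registered as data_processing.shuttle_passenger_capacity_plot, so a catalog entry under the un-prefixed name might not match it.

```python
from typing import Iterable


def resolve_dataset_name(name: str, namespace: str, exposed: Iterable[str]) -> str:
    """Simplified model of namespaced dataset naming.

    Datasets listed in the wrapper's `inputs`/`outputs` (here: `exposed`)
    keep their names; every other dataset gets a `<namespace>.` prefix.
    Parameter references are deliberately not handled in this sketch.
    """
    if name in set(exposed):
        return name
    return f"{namespace}.{name}"


# Names taken from pipeline.py above.
exposed = {"companies", "shuttles", "reviews", "model_input_table"}
for dataset in (
    "preprocessed_shuttles",
    "model_input_table",
    "shuttle_passenger_capacity_plot",
):
    print(dataset, "->", resolve_dataset_name(dataset, "data_processing", exposed))
```

If this is the cause, either renaming the catalog entry to the prefixed name or adding the plot dataset to the wrapper's outputs would be worth trying.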
